Import MDB file into Python (pandas) on Mac

4.8k Views Asked by At

I'm running python 3.6 on a mac. I have downloaded an mdb file but do not have Microsoft access, I'd like to import each table into python and work with it there.

I have installed mdbtools and run the following from Spyder:

import pandas as pd
import subprocess
import os

os.chdir('<directory where mdb file is>')

def show_tables(path='avroll_19.mdb'):
    tables = subprocess.check_output(["mdb-tables", path])
    return tables.decode().split()

show_tables()

I get this error: FileNotFoundError: [Errno 2] No such file or directory: 'mdb-tables': 'mdb-tables'

I have also tried this, but get the same error:

import pandas_access as mdb
for tbl in mdb.list_tables('avroll_19.mdb'):
    print(tbl)

I am using Sypder within Anaconda, I am not sure if that is an issue.

The mdb file is located here: https://www1.nyc.gov/assets/finance/downloads/tar/avroll_20.zip

I also attempted to do this using pyodbc, however, it appears that the driver needed for it is not available for mac.

Thank you for your help.

5

There are 5 best solutions below

0
mks212 On

I have a workaround using R rather than Python. I have referenced material from this post: https://medium.com/@wenyu.z/reading-ms-access-mdb-files-on-mac-969a176baa7a.

First, in terminal run: brew install mdbtools. Please note, this requires that homebrew is already installed.

Second, in R run this:

library(Hmisc)
data <- mdb.get('avroll_19.mdb')

Of course, substitute avroll_19.mdb with your database filename.

0
Vignesh On

I am also getting same file not found error. it is becuase my access data source is older 32bit .mdb and my python is 64 bit.

its working in 32 bit machines

1
Gord Thompson On

I have just confirmed that the following approach works with current versions of pandas, JayDeBeApi, and the UCanAccess JDBC driver. For more details on how to set up the Java/UCanAccess environment, see this answer.

import jaydebeapi
import pandas as pd

db_path = "/home/gord/UCanAccessTest.accdb"

ucanaccess_jars = [
    "/home/gord/Downloads/JDBC/UCanAccess/ucanaccess-5.0.0.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/commons-lang3-3.8.1.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/commons-logging-1.2.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/hsqldb-2.5.0.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/jackcess-3.0.1.jar",
]
classpath = ":".join(ucanaccess_jars)
cnxn = jaydebeapi.connect(
    "net.ucanaccess.jdbc.UcanaccessDriver",
    f"jdbc:ucanaccess://{db_path}",
    ["", ""],
    classpath,
)

df = pd.read_sql_query("SELECT * FROM Clients", cnxn)
print(df)
"""console output:
   ID      LastName FirstName                  DOB
0   1      Thompson      Gord  2017-04-01 07:06:27
1   2        Loblaw       Bob  1996-09-12 16:03:00
"""

Note that this works for reading from Access into pandas but will not work for writing back to Access using to_sql.

0
breizhmg On

I got pandas_access to work following this thread. The issue was that Python required Windows to run mdb-tables.exe, which it did not find. The fix was as simple as:

  • downloading mdbtools-win executables stored on this github
  • adding the folder with the executables to the Windows PATH

Then magically pandas_access.read_table started working in Python:

Before:

IPdb [13]: pandas_access.read_table('Database.accdb', "Table"])
*** FileNotFoundError: [WinError 2] The system cannot find the file specified

After:

IPdb [14]: pandas_access.read_table('Database.accdb', "Table"])
      ID      Table
0      1         A
1      2         B
2      3         S
3      4         X
4      5         Z
...
0
Rachel On

First one was super close, you forgot to add the path to chdir and the actual file name should be where the path= is like this

import subprocess import os

os.chdir('Whole_path_goes_here_minus_file_name')

    enter code here


def show_tables(path='File_name_goes_here'):
    tables = subprocess.check_output(["mdb-tables", path])
    print(tables)

    return tables.decode().split()


show_tables()

Console:

OUTPUT: b'ADM2019 C2019_A C2019_B C2019_C C2019DEP CUSTOMCGIDS2019 DRVAL2019 DRVC2019 DRVEF122019 DRVEF2019 DRVF2019 DRVGR2019 DRVIC2019 DRVOM2019 EAP2019 EF2019 EF2019A EF2019A_DIST EF2019B EF2019C EF2019D EFFY2019 EFIA2019 F1819_F1A F1819_F2 Filenames19 FLAGS2019 GR200_19 GR2019 GR2019_L2 HD2019 IC2019 IC2019_AY IC2019_PY IC2019Mission OM2019 S2019_NH S2019_OC S2019_SIS SAL2019_IS SAL2019_NIS SFA1819_P1 SFA1819_P2 SFAV1819 Tables19 Tables19_RV valuesets19 AL2019 DRVADM2019 DRVHR2019 F1819_F3 GR2019_PELL_SSL S2019_IS sectiontable19 vartable19 \n'