Fastest way to convert pandas DataFrames to Word tables


I am trying to convert pandas DataFrames to Word tables with python-docx. For large DataFrames, my current approach is extremely slow because each cell has to be accessed one by one. As far as I'm aware, the repeated calls to the cells property in python-docx (table.rows[i].cells) are what make the code so slow.

Is there a way to do this without accessing each cell separately? Or is there another, faster way to convert a pandas DataFrame to a Word table?

from docx import Document

doc = Document()

def add_table(df):
    # number of header rows (more than 1 when the columns are a MultiIndex)
    n_header = df.columns.nlevels
    table = doc.add_table(df.shape[0] + n_header, df.shape[1])
    table.style = 'Table Grid'
    if n_header > 1:
        # add one header row per column level
        for k in range(n_header):
            for j, cell in enumerate(table.rows[k].cells):
                cell.text = str(df.columns[j][k])
    else:
        # add the single header row
        for j in range(df.shape[1]):
            table.cell(0, j).text = str(df.columns[j])

    # add the rest of the dataframe (slow: .cells is re-evaluated for every row)
    for i in range(df.shape[0]):
        for j, cell in enumerate(table.rows[i + n_header].cells):
            cell.text = str(df.values[i, j])

input data:

      Numb Description
0      301      DESC 1
1      302      DESC 2
2      303      DESC 3
3      304      DESC 4
4      305      DESC 5
...    ...         ...
2131  9108      DESC 6
2132  9109      DESC 7
2133  9110      DESC 8
2134  9111      DESC 9
2135  9112     DESC 10

expected output:

Numb Description
301 DESC 1
302 DESC 2
303 DESC 3
304 DESC 4
305 DESC 5

Edit: I found a great solution that fetches the table's cells only once and then iterates through that list of cell objects: https://github.com/python-openxml/python-docx/issues/174
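For anyone landing here, this is a minimal sketch of that idea, not the exact code from the linked issue. It assumes the approach relies on python-docx's private table._cells attribute, which returns all cells as one flat, row-major list; being private, it may change between versions. The function name fill_table_fast and its header/rows parameters are my own, and the sketch only handles a single header row and tables without merged cells.

```python
def fill_table_fast(table, header, rows):
    """Fill a python-docx table from a header and an iterable of rows.

    Assumes the table was created with len(rows) + 1 rows and
    len(header) columns, and that it has no merged cells.
    """
    n_cols = len(header)

    # Assumption: table._cells is a private python-docx attribute that
    # returns every cell as a flat, row-major list. Reading it once
    # avoids rebuilding the cell list for every row.
    cells = table._cells

    # Header occupies the first n_cols entries of the flat list.
    for j, name in enumerate(header):
        cells[j].text = str(name)

    # Data row i (1-based, after the header) starts at index i * n_cols.
    for i, row in enumerate(rows, start=1):
        for j, value in enumerate(row):
            cells[i * n_cols + j].text = str(value)
```

With a DataFrame df and a document doc, this could be used by first creating the table with doc.add_table(rows=len(df) + 1, cols=df.shape[1]) and then calling fill_table_fast(table, list(df.columns), df.itertuples(index=False)). Multi-level column headers would need one extra pass per header level.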
