I am working in Pandas, I want to run a loop with a query on a list of tables in database and write the result of each loop in a separate sheet in Excel, but I get an error 'File is not a zip file' Could you help resolve it and help me make my code work? Maybe you have another workaround not using openpyxl?
My current code:
def create_workbook(path):
workbook = Workbook()
workbook.save(path)
create_workbook('result.xlsx')
def function_1(table, file):
book = load_workbook(file)
writer = pd.ExcelWriter(file, engine = 'openpyxl')
writer.book = book
query = f"select * from {table}"
df = pd.read_sql_query(query , engine)
df.to_excel(writer, sheet_name = table)
writer.close()
for table in tables_list:
function_1(table, 'result.xlsx')
You are opening the newly created Excel file 'results.xlsx' with Openpyxl and then overwriting this file with a new empty file with ExcelWriter.
This is unnecessary.
And perhaps
writer.book = bookhas an issue since when it's run 'result.xlsx' is no longer an XLSX file it is an empty file 0 bytes in size.Either way as mentioned; opening the file with Openpyxl is unnecessary and shouldn't be done.
When ExcelWriter opens the file it uses Openpyxl to do so if you have specified the engine to be Openpyxl (otherwise it uses Xlsxwriter by default or other specified module ). There is no requirement to load the file using the engine module first and this could cause issues.
In your scenario it is also not necessary to create the file before hand since ExcelWriter default is 'write' mode and it will create that file itself. Therefore the 'create_workbook' function can be done away with.
All that's needed is the code below
'result.xlsx' will be created by the line
writer = pd.ExcelWriter(file, engine='openpyxl')If you use a context manager you can also do away with the
close()too.