Why does the code jump to the last line of the function?

81 Views Asked by At

This is a function that exports an array to an Excel spreadsheet with Panda's ExcelWriter.

Or at least it used to. It has worked fine for a few years.

Now, once it gets to the line opening the ExcelWriter: with pd.ExcelWriter(excelpath) as exwriter It runs the next line: format_link = exwriter.book.add_format() Then it skips everything else and jumps to the last line of the function: worksheet.set_row(row, None, row_color)

This jump ignores everything in between - if statements, loops, etc. I put debugger breakpoints on every line between the ExcelWriter creation and the end and it skipped them all.

I'm using PyCharm 2023.3.2 Community Edition as my IDE. It is the most recent version. The interpreter is Python 3.9.13

Any ideas?

The Function:

def make_excel_array(the_array, headings, file_name, path="/Users/jrfreeze/Documents/DS_data/",
                     tab="Sheet1", col_format=True, color_rows=()):
    """
    Accepts array of data and headings for excel sheet, exports excel workbook
    :param the_array: list or lists of data
    :param headings: list of the excel columns, send 0 or "" or [] to ignore
    :param file_name: string desired name of the excel file
    :param tab: string name on tab
    :param path: string directory to pace the Excel file
    :param col_format: boolean format column width to widest entry up to max of 50
    :param color_rows: tuple of tuple of tuple and str. One middle tuple for each set of rows to color.
                    Inner tuple is row numbers to color; str is color to apply; e.g. (((1,2), 'red'), ((3,4), 'blue))
    :return:
    """
    import inspect
    excelpath = path + file_name + ".xlsx"
    if headings:
        df = pd.DataFrame(the_array, columns=headings)
    else:
        df = pd.DataFrame(the_array)
    if col_format:
        max_lens = get_max_lens(the_array, headings)
    else:
        max_lens = []
    with pd.ExcelWriter(excelpath) as exwriter:
        format_link = exwriter.book.add_format()
        format_link.set_font_color('blue')
        if headings:
            df.to_excel(exwriter, sheet_name=tab, index=False)
        else:
            df.to_excel(exwriter, sheet_name=tab, index=False, header=False)
        worksheet = exwriter.sheets[tab]
        caller = inspect.currentframe().f_back.f_code.co_name
        if caller == "get_lsdyna_user_jobs":
            worksheet.write(1, 1, df.iloc[0, 1], format_link)
            worksheet.write(1, 2, df.iloc[0, 2], format_link)
        if col_format:
            for i in range(len(max_lens)):
                worksheet.set_column(i, i, max_lens[i])
        if color_rows:
            for rows_format in color_rows:
                row_color = exwriter.book.add_format()
                row_color.set_font_color(rows_format[1])
                for row in rows_format[0]:
                    worksheet.set_row(row, None, row_color)

The function call with parameter values:

array = [['abd', 1], ['def', 2]]
headers = ['letters', 'number']
excelpath = '/Users/jrfreeze/Documents/DS_Quarterly_Reports/PY4_Q1/'
filename = 'testsheet'
make_excel_array(array, headers, filename, path=excelpath)

Normally, this exports the array to an Excel spreadsheet with the name given and the path provided in the parameters. Now it skips most of the main part of the code and then throws an error: IndexError: At least one sheet must be visible

I tried commenting out the last block starting with: if color_rows: But it did the same thing, jumping to the new last line of the function: worksheet.set_column(i, i, max_lens[i])

I'm guessing it throws the error because the line converting the dataframe (df) to Excel format: df.to_excel(exwriter, sheet_name=tab, index=False) was skipped but I have no idea why it skipping lines of code.

2

There are 2 best solutions below

6
Joey On

Firstly, you may need to update pandas and openpyxl.

pip install --upgrade pandas openpyxl

The with keyword indicates that pd.ExcelWriter is a context manager. Context managers will typically handle things like cleanup of functions (i.e. closing files) so you don't have to explicitly handle those tasks.

It could be that the context manager is defaulting to a different engine, see documentation. You can specify the engine yourself by changing the following line:

with pd.ExcelWriter(excelpath) as exwriter:

to:

with pd.ExcelWriter(excelpath, engine='openpyxl') as exwriter:
0
Joshua On

Well this is embarrassing. I managed to overlook the initial error in the stack trace. AttributeError: 'Workbook' object has no attribute 'add_format'

Pandas can use different engines to create Excel sheets. I think I knew this at one time but hadn't thought about it in a few years. One engine is openpyxl. Another is xlsxwriter. Not long ago I got a new laptop and apparently didn't install xlsxwriter on it so pandas defaulted to openpyxl which doesn't have the .add_format() function for workbooks. For some reason I don't fully understand, in handling that error, it jumped to the last line of code causing the IndexError.

Installing xlsxwriter and setting it as the engine resolved everything.

My apologies. Moral of the story: read the stack trace carefully.