Problem with .xls file validation on e-commerce platform

44 Views Asked by At

you may have noted that this is a long question, that was because I really put an effort to explain how many WTF's I am facing, and, maybe, is not that good yet, anyway, I appreciate your help!

Context

I'm doing an integration project for a client that handles a bunch of data to generate Excel files in .xls format, notice that extension!

While developing the project I was using the xlrd and xlwt python extensions, because, again, I need to create a .xls file. But at some time I had to download and extract a file and was in .csv format (but, in reality, the file contains an HTML table :c).

So I decided to use padas to read the HTML, create a data frame so I can manipulate and return a .xls excel file.

The Problem

after coding the logic and checking that the data was correct, I tried to upload this file to the e-commerce plataform.

What happened is that the platform doesn't validate my archive.

First I will briefly explain how the site work: He accepts .xls and only .xls file, probably manipulate and use them to update the database, I have access to nothing from the code source.

When I upload the file, the site takes me to a configuration page where, if I want or the site didn't relate right, I could relate excel columns to be the id or values that would be updated on the database.

Image showing the difference from a slightly different file.

The 'generico4' field expects 'smallint(5) unsigned' on the type.

An important fact is that I sent the file to my client so he could validate the data, and after many conversations between us was discovered that if he, just by downloading my file, opening, and saving, the upload works fine (the second image from my slide), important to note that he has a MacBook and me, Ubuntu. I tried to do the same thing but not worked.

He sent me this file and I tried to see the difference between both and I found nothing, the type of the numbers are the same, that is 'float', and printed via excel with the formula =TYPE(cell) returned 1.

I already tried many other things but nothing works :c

The code

Follow the code so you can have an idea of the logic

def stock_xls(data_file_path):
    # This is my logic to manipulate the data
    df = pd.read_html(data_file_path)[0]
    df = df[[1,2]]
    df.rename(columns={1:'sku', 2:'stock'}, inplace=True)
    df = df.groupby(['sku']).sum()
    df.reset_index(inplace=True)
    df.loc[df['stock'] > 0, 'stock'] = 1
    df.loc[df['stock'] == 0, 'stock'] = 2

    # I create a new Worbook (via pandas was not working too)
    wb_out = xlwt.Workbook()
    ws_out = wb_out.add_sheet(sheetname='stock')

    # Set the columns name
    ws_out.write(0, 0, 'sku')
    ws_out.write(0, 1, 'generico4')

    # Copy DataFrame data to the WorkBook
    for index, value in df.iterrows():
        ws_out.write(index + 1, 0, str(value['sku']))
        ws_out.write(index + 1, 1, int(value['stock']))
    
    path = os.path.join(BASE_DIR, f'src/xls/temp/')
    Path(path).mkdir(parents=True, exist_ok=True)
    
    file_path = os.path.join(path, "stock.xls")
    wb_out.save(file_path)
    
    return file_path
0

There are 0 best solutions below