read_excel function changes some characters to unicode ones since pandas package upgrade

515 Views Asked by HelloSmacl At 17 November 2025 at 00:23

I have upgraded pandas package, new version is : 1.4.2 and xlrd package, new version is 2.0.1

Now, when I read python file with the following command :

import pandas as pd pd.read_excel('myfile.xlsx')

I got the following warning: UserWarning: Workbook contains no default style, apply openpyxl's default

And my result is :

foo	bar
_x0031_01259	COMMUNAUTE_x0020_DE_x0020_COMMUNES_x0020_FIER_x0020_ET_x0020_USSES

While it should be :

foo	bar
101259	COMMUNAUTE DE COMMUNES FIER ET USSES

So in some columns, 1 is replaced by x0031, 2 is replaced by x0033, space is replaced by x0020, etc.

I tried to add engine parameter set to openpyxl but same warning message and same dataframe result.

Before Pandas package upgrade, I already had the problem but with engine parameter set to xlrd, it was working (but I had a warning saying that newest version won't support xlrd)

Any idea how to read correctly the file?

Original Q&A

There are 1 best solutions below

Maciej Majewski On 23 August 2023 at 06:08

xlrd package is not used for .xlsx files in pandas.

As you can read in pandas docs, for .xlsx it is using openpyxl that does not handle reading xlsx properly in some specific scenarios, causing error you have described - check this bug report

It has been working for you before, because xlrd used to support .xlsx files, but since version 2.0.0 it supports only "classic" .xls files.

As a workaround, you can use xlrd code snippet from 1.x.x version in pandas, however you have to ensure your columns does not include strings like _x0000_ that you want to preserve:

def unescape(s,
             subber=re.compile(r"_x[0-9A-Fa-f]{4,4}_", re.UNICODE).sub,
             repl=lambda mobj: chr(int(mobj.group(0)[2:6], 16))):
    if "_" in s:
        return subber(repl, s)

    return s

df["column"] = df["column"].apply(unescape)

read_excel function changes some characters to unicode ones since pandas package upgrade

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in UNIDECODER

Trending Questions

Popular # Hahtags

Popular Questions