UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in pandas


I am trying to convert a DataFrame to a list of dictionaries, but I get a Unicode error:

Code:-

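from functools import reduce

import pandas as pd

# Read the CSV in chunks, concatenating them and dropping duplicates on distinct_col.
df = reduce(lambda df_i, df_j: pd.concat([df_i, df_j]).drop_duplicates(subset=distinct_col),
            pd.read_csv(csv_filepath,
                        encoding='latin1',
                        engine='python',
                        skipinitialspace=True,
                        skiprows=header_count,
                        usecols=read_col,
                        iterator=True,
                        header=None,
                        names=csv_col_name,
                        chunksize=2,
                        sep=r'\s*[;]\s*',  # raw string so the regex backslashes are not treated as escapes
                        dtype=str))

# Drop the trailing footer rows.
df.drop(df.tail(footer_count).index, inplace=True)
# Getting the error while converting into a list of dictionaries.
csv_records = df.to_dict('records')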

Error:-

  1. With encoding='latin1', the following error occurs: UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 85: ordinal not in range(128) (line no: 171)
  2. With encoding='utf-8', the following error occurs: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 860: invalid start byte (line no: 159)
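
For context, both errors can be reproduced without pandas. The sketch below is only an illustration: it assumes the file is Latin-1 encoded and contains a 'ü' (byte 0xfc), and the sample value is hypothetical rather than taken from the actual CSV. It also assumes that error 1 is raised after the read, wherever the decoded text is encoded as ASCII (for example by a print or logging call on an ASCII-only stream), which the traceback does not confirm.

```python
# Byte 0xfc is 'ü' in Latin-1 but is not a valid UTF-8 start byte.
raw = b"M\xfcller"  # hypothetical sample value, not from the original CSV

# Decoding as UTF-8 fails, which matches error 2.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = str(exc)

# Decoding as Latin-1 succeeds, so reading with encoding='latin1' is fine.
text = raw.decode("latin1")

# Error 1 appears when the decoded text is later encoded as ASCII
# (assumption: something downstream uses an ASCII-only stream).
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    encode_error = str(exc)

# Encoding explicitly as UTF-8 does not hit the ASCII limitation.
utf8_bytes = text.encode("utf-8")
```

If that assumption holds, the read with encoding='latin1' is probably correct, and the place to look is the code at line 171 that writes or prints the records.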

Please suggest how to resolve this issue. Thanks in advance :)
