Values begin with 'b' when reading an arff file to pandas dataframe

I'm reading an ARFF file into a pandas DataFrame in Colab. I've used the following code, which seems to be fairly standard, judging by a quick scan of the top search results.

import pandas as pd
from scipy.io.arff import loadarff

# loadarff returns a (data, meta) tuple; data is a NumPy record array
raw_data = loadarff('/speeddating.arff')
df = pd.DataFrame(raw_data[0])

When I inspect the dataframe, many of the values appear in this format: b'some_text'.

When I call type(df.iloc[0,0]), it returns bytes.

What is happening, and how do I get it to not be that way?

NaiveBae On

If anyone else stumbles upon this question, I found it answered here: "Letter appeared in data when arff loaded into Python"
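
For anyone who wants the gist without following the link: scipy's loadarff returns nominal and string attributes as NumPy byte strings, so they show up in pandas as Python bytes objects (hence the b'...' prefix). Below is a minimal sketch of one way to decode them, not necessarily the approach from the linked answer. The helper name decode_bytes_columns, the sample column names, and the UTF-8 encoding are all assumptions for illustration; the small sample frame stands in for pd.DataFrame(raw_data[0]).

```python
import pandas as pd


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with bytes values in object columns decoded to str.

    Hypothetical helper; the encoding is an assumption and may need changing.
    """
    out = df.copy()
    for col in out.columns:
        # Only object columns can hold Python bytes; numeric columns are skipped.
        if out[col].dtype == object:
            out[col] = out[col].map(
                lambda v: v.decode(encoding) if isinstance(v, bytes) else v
            )
    return out


# Stand-in for pd.DataFrame(loadarff(...)[0]); the column names are made up.
sample = pd.DataFrame({"gender": [b"female", b"male"], "age": [21.0, 24.0]})
clean = decode_bytes_columns(sample)
print(clean)
```

If every value in a column is bytes, df[col].str.decode("utf-8") should do the same thing for that one column; the loop above just applies the idea to the whole frame and leaves non-bytes values alone.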