ValueError: The feature names should match those that were passed during fit

This is not a question but a solution: I ran into this problem and could not find the fix here, so I am posting it.

I wanted to build a decision tree with sklearn, but my data had a column called country whose values were not numeric.

I had to encode it, but I didn't do it properly.

Learn from my mistake.

I used this:

df['country'] = df['country'].astype('category')
df = pd.get_dummies(df, columns=['country'])

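When I then tried to predict on the test data, I got this error: ValueError: The feature names should match those that were passed during fit. The likely cause is that pd.get_dummies only creates columns for the country values present in the frame it is given, so the test data can end up with different columns than the training data.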
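A minimal sketch of how such a mismatch can arise, assuming the training and test frames are encoded separately. The frames and the country values below are made up for illustration and are not from my real data:

```python
import pandas as pd

# Hypothetical toy data; the 'age' column and country codes are assumptions.
train = pd.DataFrame({"age": [25, 40, 31], "country": ["DE", "FR", "HU"]})
test = pd.DataFrame({"age": [52, 19], "country": ["FR", "US"]})

# get_dummies only creates columns for the values present in each frame.
train_cols = list(pd.get_dummies(train, columns=["country"]).columns)
test_cols = list(pd.get_dummies(test, columns=["country"]).columns)

print(train_cols)  # ['age', 'country_DE', 'country_FR', 'country_HU']
print(test_cols)   # ['age', 'country_FR', 'country_US']
```

A model fitted on the first set of columns would see different feature names at prediction time, which is what the ValueError complains about.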

Here is the solution:

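import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.read_csv('data.csv', sep=';')

# One-hot encode 'country'; values not seen during fit become all-zero rows
ohe = OneHotEncoder(handle_unknown='ignore', sparse_output=False).set_output(transform='pandas')
ohetransform = ohe.fit_transform(df[['country']])
# Replace the original column with the encoded columns
df = pd.concat([df, ohetransform], axis=1).drop(columns=['country'])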

I hope this helps you all.

Here is a video that helped me.
