Remove all genres except those in a list in Pandas


Here are 5 sample rows from the genres column of my dataframe:

Young Adult|Fiction|Science Fiction|Dystopia|Fantasy|Science Fiction
Fantasy|Young Adult|Fiction
Classics|Fiction|Historical|Historical Fiction|Academic|School
Classics|Fiction|Romance
Young Adult|Fantasy|Romance|Paranormal|Vampires|Fiction|Fantasy|Paranormal

I want to remove all of the genres except those specified in my genre_list:

genre_list = ['Romance', 'Fantasy', 'Mystery', 'Science Fiction', 'Young Adult', 'Classics', 'Fiction', 'School', 'Middle Grade', 'Thriller']

Note that my dataframe has 54301 rows, so the solution needs to handle a lot of data.

I have tried this solution provided by ChatGPT:

def filter_genres(genre_string):
    genres = genre_string.split('|')
    filtered_genres = [genre for genre in genres if genre in genre_list]
    return '|'.join(filtered_genres)

# Apply filtering to the 'genres' column
df['filtered_genres'] = df['genres'].apply(filter_genres)

print(df)

but to no avail
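For reference, here is a minimal self-contained sketch of the same filtering idea, run against the five sample rows above. It rests on a few assumptions that are not confirmed by the question: that the column may contain missing values (represented here by a hypothetical `None` row), that stray whitespace around a genre name should be ignored, and that repeated genres should be kept as they are. The `allowed` set is an illustrative name; it only makes each membership check cheaper than scanning the list.

```python
import pandas as pd

genre_list = ['Romance', 'Fantasy', 'Mystery', 'Science Fiction', 'Young Adult',
              'Classics', 'Fiction', 'School', 'Middle Grade', 'Thriller']

# A set gives constant-time membership checks, which helps on ~54k rows.
allowed = set(genre_list)


def filter_genres(genre_string):
    # Assumption: non-string cells (None/NaN) become an empty string.
    if not isinstance(genre_string, str):
        return ''
    # Assumption: whitespace around a genre name is not significant.
    genres = (genre.strip() for genre in genre_string.split('|'))
    # Keep only allowed genres, preserving order and any duplicates.
    return '|'.join(genre for genre in genres if genre in allowed)


# The five sample rows from the question, plus one hypothetical missing value.
df = pd.DataFrame({'genres': [
    'Young Adult|Fiction|Science Fiction|Dystopia|Fantasy|Science Fiction',
    'Fantasy|Young Adult|Fiction',
    'Classics|Fiction|Historical|Historical Fiction|Academic|School',
    'Classics|Fiction|Romance',
    'Young Adult|Fantasy|Romance|Paranormal|Vampires|Fiction|Fantasy|Paranormal',
    None,
]})

df['filtered_genres'] = df['genres'].apply(filter_genres)
print(df['filtered_genres'].tolist())
```

If this runs cleanly on the samples but not on the full dataframe, the difference is probably in the data (missing values, a different separator, or extra spaces) rather than in the filtering logic itself.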
