I have the following dataset:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 513250 entries, 0 to 513249
Data columns (total 26 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Game Title 513196 non-null object
1 Game Poster 513196 non-null object
2 Game Release Date 513196 non-null object
3 Game Developer 512977 non-null object
4 Genre 513196 non-null object
5 Platforms 513250 non-null object
6 Product Rating 425527 non-null object
7 Overall Metascore 513181 non-null float64
8 Overall User Rating 513128 non-null object
9 Reviewer Name 512347 non-null object
10 Reviewer Type 513250 non-null object
11 Rating Given By The Reviewer 510449 non-null float64
12 Review Date 405044 non-null object
13 Review Text 512804 non-null object
14 0 513250 non-null object
15 1 513250 non-null object
16 2 513250 non-null object
17 3 513250 non-null object
18 4 513250 non-null object
19 5 513250 non-null object
20 6 513250 non-null object
21 7 513250 non-null object
22 8 513250 non-null object
23 9 513250 non-null object
24 10 513250 non-null bool
25 11 513250 non-null object
dtypes: bool(1), float64(2), object(23)
memory usage: 98.4+ MB
Where columns named 0 to 11 contain 'PC' values in several rows. I want to drop the lines that do not contain the PC value by index. But I tried in different ways and I didn't succeed. Columns that do not contain the PC value are set to "False".
How can I do this?
I believe that some way of iterating over the columns will help, but I tried several ways and it didn't work.
I tried:
for index, row in df.iterrows():
if 'PC' not in row.values:
df.drop(index, inplace=True)
I tried too:
df = df[df['column'] == 'PC']
# Resetando os índices após a filtragem
df.reset_index(drop=True, inplace=True)
Other solution:
column_pc = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
df = df[df[colum_pc].apply(lambda row: 'PC' in row.values, axis=1)]
df.reset_index(drop=True, inplace=True)
Assuming you want to drop all rows that contain at least one "PC" in the first 12 columns, use boolean indexing with
all:How is works: