How to filter NaN values using Pandas?

99 Views Asked by Denis Tocilin At 14 February 2024 at 15:50

Imagine the following dataframe:

import pandas as pd
df = pd.DataFrame(columns = ['A','B','C'], index = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'])
df.loc[['Monday', 'Friday'], ['A', 'C']] = 19

I want to print all columns and rows, that are not NaN. The output should be as follows:

	A	C
Monday	19	19
Friday	19	19

I have tried different methods, but none of them gives me the required result. I am doing something wrong here. Please help.

df = df[df.isna()]
df[(df.iloc[::]) & (df.isnull() == False)]

Original Q&A

There are 2 best solutions below

Vikas Sharma On 14 February 2024 at 16:12

Your code:

import pandas as pd
df = pd.DataFrame(columns = ['A','B','C'], index = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'])
df.loc[['Monday', 'Friday'], ['A', 'C']] = 19
print(df)

Output:

             A    B    C
Monday      19  NaN   19
Tuesday    NaN  NaN  NaN
Wednesday  NaN  NaN  NaN
Thursday   NaN  NaN  NaN
Friday      19  NaN   19

Add this to your code:

df = df.stack().unstack()
print(df)

Output:

         A   C
Monday  19  19
Friday  19  19

mozway On 14 February 2024 at 18:43

Assuming you want to remove rows/columns that are fully composed of NaNs use dropna on both axes:

out = df.dropna(how='all').dropna(how='all', axis=1)

Output:

         A   C
Monday  19  19
Friday  19  19

Timings

Comparison with df.stack().unstack() for a 1000x1000 DataFrame (50% NaNs):

# df.stack().unstack()
76 ms ± 3.73 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# df2.dropna(how='all').dropna(how='all', axis=1)
3.82 ms ± 84.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each

How to filter NaN values using Pandas?

There are 2 best solutions below

Timings

Related Questions in PYTHON-3.X

Related Questions in PANDAS

Related Questions in DATAFRAME

Related Questions in FILTERING

Related Questions in NAN

Trending Questions

Popular # Hahtags

Popular Questions