Pandas SettingWithCopyWarning is killing me

I'm trying to filter a pandas DataFrame:

df = pd.read_csv('ml_data.csv', dtype=str)

def df_filter(df):
    #df = df.copy()

    df.replace('(not set)', '(none)', inplace=True) #comment this and warning will disappear!!!
    df = df[df['device_browser'] != '(none)'] #comment this and warning will disappear!!!

    def browser_filter(s): 
        return ''.join([c for c in s if c.isalpha()])
    df['device_browser'] = df['device_browser'].apply(browser_filter)

    return df

df = df_filter(df)

And I receive this warning:


/tmp/ipykernel_2185/1710484338.py:11: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['device_browser'] = df['device_browser'].apply(browser_filter)

But if I uncomment

#df = df.copy() 

OR comment

df.replace('(not set)', '(none)', inplace=True) 

OR comment

df = df[df['device_browser'] != '(none)']

OR don't wrap the filtering in the df_filter function

this warning will disappear!!! WHY??????????

I danced around the fire and beat the tambourine...

There is 1 best solution below

Because calling df.copy() creates a deep copy of your DataFrame: as you can see in the documentation, deep=True is the default, so the copy gets its own data and is no longer tied to the original.

So if you work on a deep copy of your base DataFrame, the warning disappears.
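To make the deep-copy behaviour concrete, here is a minimal sketch. The two-row DataFrame and the variable names are made up for illustration; only DataFrame.copy() and its default deep=True come from the discussion above.

```python
import pandas as pd

# Hypothetical sample data, not taken from ml_data.csv
original = pd.DataFrame({"device_browser": ["Chrome 99", "(not set)"]})

# deep=True is the default, so this is the same as original.copy(deep=True)
independent = original.copy()

# Change a value in the copy only
independent.loc[0, "device_browser"] = "Firefox"

print(original.loc[0, "device_browser"])     # the original keeps its own value
print(independent.loc[0, "device_browser"])  # only the copy was changed
```

Because the copy owns its data, pandas has no reason to suspect that an assignment on it was meant for some other DataFrame.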

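But if you don't make that copy, you are working on the caller's DataFrame. Note that df.replace('(not set)', '(none)', inplace=True) does not create a copy of any kind: it modifies the DataFrame you passed in. The object pandas is unsure about is the result of the filter, df = df[df['device_browser'] != '(none)']. Pandas remembers that this result was derived from the original DataFrame, which the caller still references, so the later assignment to df['device_browser'] triggers the warning.
As far as I can tell, the replace call matters only because it creates the '(none)' rows that the filter then drops; when the filter removes nothing, pandas appears to skip the warning. So it is logical that removing either of the two lines makes the warning go away.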
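One common way to avoid the warning is to take an explicit copy of the filtered result before assigning to it. The sketch below is a variation on the function from the question, not the only possible fix. It assumes you do not need the caller's DataFrame to be modified, so it also swaps the in-place replace for the non-mutating form. The sample rows are invented for illustration.

```python
import pandas as pd


def browser_filter(s):
    # Keep only alphabetic characters, as in the question
    return "".join(c for c in s if c.isalpha())


def df_filter(df):
    # Non-mutating replace: returns a new DataFrame, the caller's is untouched
    df = df.replace("(not set)", "(none)")

    # .copy() makes the filtered result an independent DataFrame,
    # so the column assignment below is unambiguous
    df = df[df["device_browser"] != "(none)"].copy()

    df["device_browser"] = df["device_browser"].apply(browser_filter)
    return df


# Hypothetical sample data standing in for ml_data.csv
raw = pd.DataFrame({"device_browser": ["Chrome 99", "(not set)", "Safari (in-app)"]})
clean = df_filter(raw)

print(clean["device_browser"].tolist())
print(raw["device_browser"].tolist())  # unchanged by df_filter
```

Copying after the filter rather than at the top of the function means only the rows you keep are duplicated.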

I suggest reading about the difference between shallow and deep copies in this Stack Overflow question.