TypeError: category type does not support sum operations (in pandas)


I am working with seaborn's flights dataset.

import seaborn as sns

flights = sns.load_dataset('flights') 
flights.groupby(['year']).sum()

When I run this, I get the error: TypeError: category type does not support sum operations

I am also running into this issue with clustermap and lineplot.

Your assistance will be appreciated!


There are 2 best solutions below

webelo On BEST ANSWER

This snippet works in pandas 1.* but not in pandas 2.

import seaborn as sns

flights = sns.load_dataset('flights') 
flights.groupby(['year']).sum() # Error

The issue is that the month column has type category:

flights.info(True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 144 entries, 0 to 143
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   year        144 non-null    int64   
 1   month       144 non-null    category
 2   passengers  144 non-null    int64   
dtypes: category(1), int64(2)
memory usage: 2.9 KB

In pandas 1.*, the month column is automatically dropped because its type does not support the sum method. In pandas 2, groupby sums no longer drop such columns silently and raise the TypeError instead.
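As a quick check of which columns are affected, the sketch below splits a frame into numeric and non-numeric columns. It uses a small made-up stand-in (the `toy` frame and its values are assumptions, not the real flights data) so that it runs without downloading anything.

```python
import pandas as pd

# Toy stand-in for the flights data; the values are invented for illustration.
toy = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": pd.Categorical(["Jan", "Feb", "Jan", "Feb"]),
    "passengers": [10, 20, 30, 40],
})

# Columns that can be summed versus columns that cannot (here: the categorical one).
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
non_numeric_cols = toy.select_dtypes(exclude="number").columns.tolist()

print(numeric_cols)
print(non_numeric_cols)
```

Any column that shows up in the second list is a candidate for triggering the error in a groupby sum.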

To get the same result in pandas 2, explicitly select the passengers column (and any other columns of interest):

flights.groupby('year')[['passengers']].sum()

yields:

      passengers
year            
1949        1520
1950        1676
1951        2042
1952        2364
1953        2700
1954        2867
1955        3408
1956        3939
1957        4421
1958        4572
1959        5140
1960        5714
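An alternative that should give the same totals is to pass numeric_only=True to sum, which skips non-numeric columns such as month. The sketch below uses a small made-up frame (the `toy` name and its values are assumptions, not the real flights data) and compares both approaches.

```python
import pandas as pd

# Toy stand-in for the flights data; the values are invented for illustration.
toy = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": pd.Categorical(["Jan", "Feb", "Jan", "Feb"]),
    "passengers": [10, 20, 30, 40],
})

# Option 1: select the column(s) to sum explicitly.
selected = toy.groupby("year")[["passengers"]].sum()

# Option 2: let pandas skip the non-numeric columns.
totals = toy.groupby("year").sum(numeric_only=True)

print(totals)
```

Explicit column selection is the more predictable choice when the frame has numeric columns you do not want summed; numeric_only=True is convenient when every numeric column should be included.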

Smordy On

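You should make sure the year column is a numerical or integer type, and sum only the numeric columns, because the categorical month column is what triggers the error:

flights['year'] = flights['year'].astype(int)
result = flights.groupby(['year']).sum(numeric_only=True)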