pandas split-apply-combine creates undesired MultiIndex

I am using the split-apply-combine pattern in pandas to group my df and apply a custom function to each group. But this returns an undesired DataFrame in which the grouped column exists twice: once in a MultiIndex and once in the columns.

The following is a simplified example of my problem.

Say, I have this df

import pandas as pd
df = pd.DataFrame([[1,2],[3,4],[1,5]], columns=['A','B'])

   A  B
0  1  2
1  3  4
2  1  5

I want to group by column A and keep only those rows where B has an even value. Thus the desired df is this:

   B
A        
1  2
3  4

The custom function my_combine_func should do the filtering. But applying it after a groupby leads to a MultiIndex with the former index in the second level, so column A exists twice.

def my_combine_func(group):
    return group[group['B'] % 2 == 0]

df.groupby(['A']).apply(my_combine_func)

     A  B
A        
1 0  1  2
3 1  3  4

How can I apply a custom group function and get the desired df?

There is 1 best solution below

It's easier to apply the condition to column B within each group, so you get a boolean mask back and can index df with it:

df[df.groupby('A')['B'].apply(lambda x: x % 2 == 0)]

   A  B
0  1  2
1  3  4
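
Depending on the pandas version, apply on a groupby may prepend the group keys to the index of the result, in which case the mask no longer aligns with df. A minimal sketch of the same idea using transform, which keeps the original row index, is below. The names mask and result are only illustrative, and the final set_index('A') is an assumption about the wanted shape (A as the index, B as the only column), taken from the desired output in the question.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [1, 5]], columns=['A', 'B'])

# Per-group boolean mask; transform returns a Series with the same
# row index as df, so it can be used directly for boolean indexing.
mask = df.groupby('A')['B'].transform(lambda x: x % 2 == 0)

# Keep the matching rows, then move A from the columns into the index.
result = df[mask].set_index('A')
print(result)
```

For this particular condition the per-group step is not strictly needed, since df[df['B'] % 2 == 0] selects the same rows; the groupby version matters when the filter depends on group-level values.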