Groupby 3 Columns and Create Column that Calculates Percentage Contribution

25 Views Asked by At

I have the following df:

    Region   Country  Category  Number
0   NA       CA       A         2
1   NA       CA       B         1
2   NA       US       A         2
3   NA       US       B         3
4   NA       MX       A         2
5   NA       MX       B         1
0   EU       FR       A         2
1   EU       FR       B         1
2   EU       NL       A         1
3   EU       NL       B         2
4   EU       DK       A         1
5   EU       DK       B         2

I am trying to create a new column that contains the relative contributions of a Country-Category to the overall Region.

For example, the NA-CA-A combination would have a contribution of 2/(2+2+2) or (NA-CA-A)/((NA-CA-A)+(NA-US-A)+(NA-MX-A)) = .333.. or 33.33...%.

Output would look like:

    Region   Country  Category  Population  Number
0   NA       CA       A         2           .33
1   NA       CA       B         1           .2
2   NA       US       A         2           .33
3   NA       US       B         3           .6
4   NA       MX       A         2           .33
5   NA       MX       B         1           .2
0   EU       FR       A         2           .5
1   EU       FR       B         1           .2
2   EU       NL       A         1           .25
3   EU       NL       B         2           .4
4   EU       DK       A         1           .25
5   EU       DK       B         2           .4

I have tried the following code with no luck:

df['Contribution'] = df.groupby(['Region','Country,'Category'])['Number'].apply(lambda x: x*100/x.sum())

Thank you for your help!

0

There are 0 best solutions below