Using groupby and cumsum to get a new column in pandas

I have the following dataframe:

Class Received Issued
FD 10 0
FD 0 2
RM 5 0
RM 0 3
FD 0 2
PM 5 0
PM 1 0
RM 1 0
FD 4 0

I require the dataframe below:

Class Received Issued Remaining Quantity
FD 10 0 10
FD 0 2 8
RM 5 0 5
RM 0 3 2
FD 0 2 6
PM 5 0 5
PM 1 0 6
RM 1 0 3
FD 4 0 10

The Remaining Quantity column is the cumulative sum of Received - Issued within each Class. I have tried several methods but can't get this result.
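
To make the question reproducible, here is a minimal sketch that builds the sample frame. The variable names `df` and `expected` are assumptions chosen for illustration; the values are copied from the two tables above.

```python
import pandas as pd

# Sample data copied from the input table in the question.
df = pd.DataFrame({
    "Class": ["FD", "FD", "RM", "RM", "FD", "PM", "PM", "RM", "FD"],
    "Received": [10, 0, 5, 0, 0, 5, 1, 1, 4],
    "Issued": [0, 2, 0, 3, 2, 0, 0, 0, 0],
})

# Desired "Remaining Quantity" values, copied from the required output table.
expected = [10, 8, 5, 2, 6, 5, 6, 3, 10]
```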

There are 4 best solutions below

2
Mark On
df['Remaining Quantity'] = df.groupby('Class').apply(
    lambda x: x['Received'].cumsum() - x['Issued'].cumsum()
    ).reset_index(level=0, drop=True)

Output:

  Class  Received  Issued  Remaining Quantity
0    FD        10       0                  10
1    FD         0       2                   8
2    RM         5       0                   5
3    RM         0       3                   2
4    FD         0       2                   6
5    PM         5       0                   5
6    PM         1       0                   6
7    RM         1       0                   3
8    FD         4       0                  10
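
If `apply` turns out to be slow on a large frame, one possible variant (a sketch, not necessarily what the answer intended) is to subtract the columns first and group the resulting Series by the `Class` column. The name `net` is an assumption introduced here.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Class": ["FD", "FD", "RM", "RM", "FD", "PM", "PM", "RM", "FD"],
    "Received": [10, 0, 5, 0, 0, 5, 1, 1, 4],
    "Issued": [0, 2, 0, 3, 2, 0, 0, 0, 0],
})

# Net movement per row (positive for receipts, negative for issues).
net = df["Received"] - df["Issued"]

# Running total of the net movement within each Class; the result keeps
# the original row index, so it can be assigned straight back.
df["Remaining Quantity"] = net.groupby(df["Class"]).cumsum()
print(df)
```

Because the sum of differences equals the difference of sums, this gives the same values as taking the two cumulative sums separately.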
0
Timeless On

Another possible solution:

df["Remaining Quantity"] = (
    df.eval("tmp=Received-Issued").groupby("Class")["tmp"].cumsum()
)

Output:

print(df)

  Class  Received  Issued  Remaining Quantity
0    FD        10       0                  10
1    FD         0       2                   8
2    RM         5       0                   5
3    RM         0       3                   2
4    FD         0       2                   6
5    PM         5       0                   5
6    PM         1       0                   6
7    RM         1       0                   3
8    FD         4       0                  10
3
Andrej Kesely On

Another solution:

df["Remaining Quantity"] = (g := df.groupby("Class").cumsum())["Received"] - g["Issued"]
print(df)

Prints:

  Class  Received  Issued  Remaining Quantity
0    FD        10       0                  10
1    FD         0       2                   8
2    RM         5       0                   5
3    RM         0       3                   2
4    FD         0       2                   6
5    PM         5       0                   5
6    PM         1       0                   6
7    RM         1       0                   3
8    FD         4       0                  10

OR: Using .pipe:

df["Remaining Quantity"] = df.groupby("Class").cumsum().pipe(lambda g: g["Received"] - g["Issued"])

OR: Using .eval:

df["Remaining Quantity"] = df.groupby("Class").cumsum().eval("Received - Issued")
0
Umar.H On

One way is to use .stack to compute a running difference per class, then assign the result back along the original index. Issued is negated first so that a single cumulative sum nets receipts against issues.

df['Remaining Quantity'] = (
    df.assign(Issued=-df['Issued']).set_index('Class', append=True)
      .stack().groupby(level=1).cumsum()
      .unstack(-1).droplevel(1, axis=0)['Issued']
)

print(df)

  Class  Received  Issued  Remaining Quantity
0    FD        10       0                  10
1    FD         0       2                   8
2    RM         5       0                   5
3    RM         0       3                   2
4    FD         0       2                   6
5    PM         5       0                   5
6    PM         1       0                   6
7    RM         1       0                   3
8    FD         4       0                  10
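
To see why selecting the 'Issued' column at the end gives the remaining quantity, the chain can be split into steps. This is only an illustrative sketch; the intermediate names (`signed`, `long_form`, `running`, `wide`, `remaining`) are assumptions, not part of the answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Class": ["FD", "FD", "RM", "RM", "FD", "PM", "PM", "RM", "FD"],
    "Received": [10, 0, 5, 0, 0, 5, 1, 1, 4],
    "Issued": [0, 2, 0, 3, 2, 0, 0, 0, 0],
})

# Negate Issued so one running total nets receipts against issues,
# and move Class into the index next to the original row label.
signed = df.assign(Issued=-df["Issued"]).set_index("Class", append=True)

# Long format: one value per (row, Class, column), row by row,
# with Received ahead of Issued inside each row.
long_form = signed.stack()

# Running total within each Class (index level 1).
running = long_form.groupby(level=1).cumsum()

# Back to wide format, indexed by the original row labels only.
wide = running.unstack(-1).droplevel(1, axis=0)

# Issued is the last entry of each row, so its running total already
# includes that row's Received value: this is the remaining quantity.
remaining = wide["Issued"]
print(remaining)
```

Note that this relies on Received coming before Issued in the column order; with the columns swapped, the 'Issued' total would miss the receipt on the same row.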