I need to update Data of a subset of my df with values calculated out of a subset of my dataframe.
| PERSNR | XYZ | DATE | VALUE |
|---|---|---|---|
| 22222 | a | Jan | 0,8 |
| 22222 | b | Jan | 0,2 |
| 22222 | a | Feb | 0,8 |
| 22222 | b | Feb | 0,2 |
i have got a Dataframe with like 8000 PERSNR that looks like this, just with many more columns
For a subset, which i already know (i got a list of PERSNR to filter that) i need to add the Values together, ignoring XYZ
i Can do that with pivot_table and can get another df out of it.
The problem i have got is, that i need to update the values afterwards in my original DF.
My goal is to obtain the following without loosing any other data of my DF.
| PERSNR | XYZ | DATE | VALUE |
|---|---|---|---|
| 22222 | a | Jan | 1,0 |
| 22222 | b | Jan | 0,0 |
in Excel i would do a sumifs on my Value Column on my PERSNR, XYZ and DATE.
and then i would replace my values with vlookup in the Value column and overwrite Values with identifier XYZ "b" with zero. This will always be zero.
Any suggestions on how to do that?
thanks in advance
Mike
Assuming a data frame like this:
Merge two data frames: (i) the original minus column
VALUEand (ii) a data frame where you group onPERSNR(assuming there will be more than one value for this; otherwise, this column isn't necessary) andDATE, then sum theVALUEin each group and reset the index.Set column
VALUEwith0wheredf["XYZ"] == "b".Use
.locto do the previous steps only for the selectedPERSNRvalues.