Update: I mixed up the axes, which is why my question seemed weird. This is the new input data.
Given the following data:
data = {'Org': ['Tom', 'Kelly', 'Rick', 'Dave','Sara','Liz'],
'A': ['NaN', 1, 1, 1, 'NaN', 'NaN'],
'B': [1, 1, 1, 1, 'NaN', 1],
'C': [1, 1, 1, 1, 1, 1],
'D': ['NaN', 'NaN', 1, 'NaN', 1, 'NaN'],
'E': [1, 1, 1, 1, 'NaN', 1],
'F': ['NaN', 1, 1, 1, 'NaN', 1]}
df = pd.DataFrame(data)
I want to sum each column except the first two, and then replace every value that is not NaN with the sum of its column.
The result should look like this:
data = {'Org': ['Tom', 'Kelly', 'Rick', 'Dave','Sara','Liz'],
'A': ['NaN', 1, 1, 1, 'NaN', 'NaN'],
'B': [5, 5, 5, 5, 'NaN', 5],
'C': [6, 6, 6, 6, 6, 6],
'D': ['NaN', 'NaN', 2, 'NaN', 2, 'NaN'],
'E': [5, 5, 5, 5, 'NaN', 5],
'F': ['NaN', 4, 4, 4, 'NaN', 4]}
df = pd.DataFrame(data)
I tried:
column_sums = df.iloc[:, 2:].sum()
for column in df.iloc[:, 2:].columns:
    df[column] = column_sums[column]
But that replaces all values, including the NaN ones.
Is there a clean solution for this?
Thanks
Build a mask, sum, and modify in place after broadcasting the sum:
Output:
Intermediate m:
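A minimal sketch of the mask-and-broadcast idea described above, not necessarily the exact code of the original answer. It assumes the 'NaN' strings in the question stand for real missing values (np.nan), since string placeholders would make the columns non-numeric. The names cols and sums are only illustrative; m is the intermediate boolean mask.

```python
import numpy as np
import pandas as pd

# Assumption: 'NaN' in the question means a real missing value (np.nan).
data = {'Org': ['Tom', 'Kelly', 'Rick', 'Dave', 'Sara', 'Liz'],
        'A': [np.nan, 1, 1, 1, np.nan, np.nan],
        'B': [1, 1, 1, 1, np.nan, 1],
        'C': [1, 1, 1, 1, 1, 1],
        'D': [np.nan, np.nan, 1, np.nan, 1, np.nan],
        'E': [1, 1, 1, 1, np.nan, 1],
        'F': [np.nan, 1, 1, 1, np.nan, 1]}
df = pd.DataFrame(data)

# Every column except the first two ('Org' and 'A').
cols = list(df.columns[2:])

# Intermediate mask: True where a value is present, False where it is NaN.
m = df[cols].notna()

# Per-column sums; NaN values are skipped by default.
sums = df[cols].sum()

# Broadcast the sums across the rows and keep NaN where the mask is False.
df[cols] = np.where(m, sums.to_numpy(), np.nan)

print(m)
print(df)
```

The loop in the question assigns the scalar sum to the whole column, which is why every row is overwritten. Selecting through the mask is what leaves the NaN cells untouched. Note that the affected columns end up as floats, because NaN cannot be stored in an integer column.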