I have this dataframe:
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    'Condition': [False, False, True, False, False, False, False, False, False, False, True, False]})
    ID  Condition
0    1      False
1    1      False
2    1       True
3    1      False
4    1      False
5    1      False
6    1      False
7    1      False
8    1      False
9    1      False
10   1       True
11   1      False
I want to add a new column, Sequence, containing a sequence of numbers. Starting from the first True in the Condition column, the rows must contain the repeating sequence 1, 2, 3, 1, 2, 3, ... until another True appears, at which point the sequence restarts at 1. Ideally, the rows before the first True should contain 0. The final result would be:
    ID  Condition  Sequence
0    1      False         0
1    1      False         0
2    1       True         1
3    1      False         2
4    1      False         3
5    1      False         1
6    1      False         2
7    1      False         3
8    1      False         1
9    1      False         2
10   1       True         1
11   1      False         2
I have tried to do it with cumsum and cumcount but I can't find the exact code.
Any suggestions?
Let us use cumsum to identify blocks of rows, then group the dataframe by those blocks and use cumcount to create a sequential counter. With some simple maths we can then get the output.

Explained
- Identify blocks/groups of rows using cumsum.
- Group the dataframe by the blocks and use cumcount to create a sequential counter per block.
- Take the sequential counter modulo (%) 3 and add 1, which creates a sequence that repeats 1, 2, 3 every three rows.
- Mask the values in the sequence with 0 where the group (b) is < 1.

Result: the new Sequence column matches the expected output shown in the question.
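The steps above can be sketched as follows. This is a minimal reconstruction from the explanation, not necessarily the exact code the answer originally contained. The name b comes from the explanation; counter and sequence are illustrative names introduced here, and the period of 3 is taken from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1] * 12,
    'Condition': [False, False, True, False, False, False,
                  False, False, False, False, True, False],
})

# Step 1: running count of True values. It increases by 1 at each True,
# so rows sharing a value of b form one block (b == 0 before the first True).
b = df['Condition'].cumsum()

# Step 2: 0-based position of each row inside its block.
counter = df.groupby(b).cumcount()

# Step 3: wrap the counter every 3 rows, then shift 0, 1, 2 to 1, 2, 3.
sequence = counter % 3 + 1

# Step 4: rows before the first True (b < 1) get 0 instead.
df['Sequence'] = sequence.mask(b < 1, 0)

print(df['Sequence'].tolist())
# [0, 0, 1, 2, 3, 1, 2, 3, 1, 2, 1, 2]
```

Keeping the intermediate Series separate makes each step easy to inspect. The same logic can be chained into a single expression if preferred.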