I have a dataframe representing paneldata with ids and time. For each id and year I like to calculate a cummulative sum. But id and time groups are repeated.
df <- data.frame(id = rep(12345,15),
year = c(rep(2021,5), rep(2022,5), rep(2023,5)),
value = c(rep(5,5), rep(3,5), rep(4,5)))
df
id year value
1 12345 2021 5
2 12345 2021 5
3 12345 2021 5
4 12345 2021 5
5 12345 2021 5
6 12345 2022 3
7 12345 2022 3
8 12345 2022 3
9 12345 2022 3
10 12345 2022 3
11 12345 2023 4
12 12345 2023 4
13 12345 2023 4
14 12345 2023 4
15 12345 2023 4
So instead of this
df %>%
group_by(id, year) %>%
mutate(value_cumsum = cumsum(value))
# A tibble: 15 × 4
# Groups: id, year [3]
id year value value_cumsum
<dbl> <dbl> <dbl> <dbl>
1 12345 2021 5 5
2 12345 2021 5 10
3 12345 2021 5 15
4 12345 2021 5 20
5 12345 2021 5 25
6 12345 2022 3 3
7 12345 2022 3 6
8 12345 2022 3 9
9 12345 2022 3 12
10 12345 2022 3 15
11 12345 2023 4 4
12 12345 2023 4 8
13 12345 2023 4 12
14 12345 2023 4 16
15 12345 2023 4 20
I need value_cumsum to be 5, 5, 5, 5, 5, 8, 8, 8, 8, 8, 12, 12, 12, 12, 12.
you can try
or
which gives