This is my first question here, so I apologize in advance if I'm doing it wrong. I have a dataset with 20,000 observations and a dummy variable (0, 1). I want to delete rows with repeated values, but only when the value is 1. That is, if I have repeated 0s, I want to keep them all, but if I have repeated 1s, I want to keep only the first one. I also want to do this by group. Is it possible?
This is how my data looks now:

In this excerpt, I would like to keep all data from 1920 until 1922, drop the rows from 1923 to 1929, and keep the remaining observations.
This is what I have tried so far, but it drops all observations after the first 1, including rows with a value of 0.
df %>%
  arrange(country, year) %>%
  group_by(country) %>%
  slice(if (1 %in% event) seq(match(1, event)) else row_number()) %>%
  ungroup()
Thank you!
In base R you could use a modified sequence(rle(...)) approach to identify consecutive instances of event, with some additional logic to meet your specific needs:
If you wanted to do it by country, you'd likely want to use the dplyr approach with .by = country:
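A minimal sketch of what the grouped version could look like, not necessarily the code originally posted here. The toy data frame is made up for illustration (the question's data is not shown), and the sketch assumes dplyr 1.1.0 or later, where filter() accepts .by. Within each country, sequence(rle(event)$lengths) numbers the rows of every consecutive run, so a row is kept if it is a 0 or if it is the first row of a run of 1s.

```r
library(dplyr)

# Hypothetical example data: two countries, six years each
df <- data.frame(
  country = rep(c("A", "B"), each = 6),
  year    = rep(1920:1925, times = 2),
  event   = c(0, 0, 1, 1, 1, 0,
              1, 1, 0, 0, 1, 1)
)

result <- df %>%
  arrange(country, year) %>%
  filter(
    # position of each row within its run of identical event values
    event == 0 | sequence(rle(event)$lengths) == 1,
    .by = country
  )

print(result)
```

Because the condition only looks at runs of 1s, every 0 row survives, including those that come after an event. Each new run of 1s within a country keeps its first row.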