R label sequence by changing factor values

63 Views Asked by Jason Richardson At 10 March 2022 at 19:42

I have the following data:

Account	date	type
1	2021-08-31	0
1	2021-09-23	0
1	2021-09-30	5
1	2021-10-30	0
1	2021-12-29	0
1	2022-01-31	8
1	2022-02-02	0

I need to find the minimum date of each individual transition.

group_by(Account, type) %>%
summarise(first_appearance = min(date))

returns

Account	date	type
1	2021-08-31	0
1	2021-09-30	5
1	2022-01-31	8

How can I group by each SHIFT in type?

My initial thoughts are to generate some sort of sequence along the factors and concatenate to have a unique grouping variable, but how would this be done?

Account	date	type	order	type_order
1	2021-08-31	0	A	0A
1	2021-09-23	0	A	0A
1	2021-09-30	5	A	5A
1	2021-10-30	0	B	0B
1	2021-12-29	0	B	0B
1	2022-01-31	8	A	8A
1	2022-02-02	0	C	0C

Desired output would be:

group_by(Account, type_order) %>%
summarise(first_appearance = min(date))

Account	date	type	order	type_order
1	2021-08-31	0	A	0A
1	2021-09-30	5	A	5A
1	2021-10-30	0	B	0B
1	2022-01-31	8	A	8A
1	2022-02-02	0	C	0C

Original Q&A

There are 1 best solutions below

Ben On 10 March 2022 at 21:04 BEST ANSWER

Maybe use rleid from data.table to assign groups when there are differences in type from row to row.

library(tidyverse)
library(data.table)

df %>%
  group_by(Account, grp = rleid(type), type, order) %>%
  summarise(first_appearance = min(date))

Output

  Account   grp  type order first_appearance
    <int> <int> <int> <chr> <chr>           
1       1     1     0 A     2021-08-31      
2       1     2     5 A     2021-09-30      
3       1     3     0 B     2021-10-30      
4       1     4     8 A     2022-01-31      
5       1     5     0 C     2022-02-02

R label sequence by changing factor values

There are 1 best solutions below

Related Questions in R

Related Questions in DPLYR

Related Questions in SEQ

Trending Questions

Popular # Hahtags

Popular Questions