Expand rows by date range using start and end date to have a variable observed for each quarter

55 Views Asked by At

I'm trying to make a new column on my table to observe every quarter where a ID has been marked with a certain Status. (note: data set ends in Oct 2023). Therefore I would probably need to/should change all NAs in End_date to 2023-10-31 and Quarter to 2023.3

My table currently looks like this:

ID      Start_date       End_date         Status    Quarter_Start     Quarter_End 
1       2019-03-28       2020-03-26       A         2019.1             2020.1      
2       2011-02-28       2011-04-12       C         2011.1             2011.2       
3       2019-03-28       NA               F         2019.1             NA
3       2005-02-28       2007-06-20       A         2005.1             2020.2       

And I want it to look like this:

ID      Start_date       End_date         Status    Quarter           
1       2019-03-28       2020-03-26       A         2019.1
1       2019-03-28       2020-03-26       A         2019.2
1       2019-03-28       2020-03-26       A         2019.3
1       2019-03-28       2020-03-26       A         2019.4
1       2019-03-28       2020-03-26       A         2020.1
             
2       2011-02-28       2011-04-12       C         2011.1 
2       2011-02-28       2011-04-12       C         2011.2  
           
3       2023-03-28       2023-10-31       F         2023.1
3       2023-03-28       2023-10-31       F         2023.2 
3       2023-03-28       2023-10-31       F         2023.3 
            
3       2005-02-28       2007-06-20       A         2005.1
3       2005-02-28       2007-06-20       A         2005.2
3       2005-02-28       2007-06-20       A         2005.3
3       2005-02-28       2007-06-20       A         2005.4
3       2005-02-28       2007-06-20       A         2006.1
3       2005-02-28       2007-06-20       A         2006.2 
3       2005-02-28       2007-06-20       A         2006.3 
3       2005-02-28       2007-06-20       A         2006.4 
3       2005-02-28       2007-06-20       A         2007.1 
3       2005-02-28       2007-06-20       A         2007.2 

I've tried most of the options on this post Expand rows by date range using start and end date, and none have worked for me so far.

1

There are 1 best solutions below

0
zephryl On BEST ANSWER

You can make a list column with the sequence of quarters for each pair of start and end dates, then tidyr::unnest_longer():

library(lubridate)
library(dplyr)
library(tidyr)
library(purrr)

dat %>%
  mutate(  
    End_date = replace_na(End_date, ymd("2023-10-31")),
    Quarter = map2(
      floor_date(Start_date, unit = "quarter"), 
      End_date, 
      \(start, end) quarter(seq(start, end, by = "quarter"), type = "year.quarter")
    )
  ) %>%
  unnest_longer(Quarter) %>%
  select(!Quarter_Start:Quarter_End)
# A tibble: 37 × 5
      ID Start_date End_date   Status Quarter
   <int> <date>     <date>     <chr>    <dbl>
 1     1 2019-03-28 2020-03-26 A       2019.1
 2     1 2019-03-28 2020-03-26 A       2019.2
 3     1 2019-03-28 2020-03-26 A       2019.3
 4     1 2019-03-28 2020-03-26 A       2019.4
 5     1 2019-03-28 2020-03-26 A       2020.1
 6     2 2011-02-28 2011-04-12 C       2011.1
 7     2 2011-02-28 2011-04-12 C       2011.2
 8     3 2019-03-28 2023-10-31 F       2019.1
 9     3 2019-03-28 2023-10-31 F       2019.2
10     3 2019-03-28 2023-10-31 F       2019.3
# ℹ 27 more rows