dplyr mutate sequence by n levels of grouping by variable

85 Views Asked by At

So, my data looks like the following:

Period Date
1 01-01-2020
1 02-01-2020
1 03-01-2020
2 04-01-2020
2 05-01-2020
2 06-01-2020
3 07-01-2020
3 08-01-2020
3 09-01-2020
4 10-01-2020
4 11-01-2020
4 12-01-2020

The period variable is always a sequence of 3 numbers. I'm trying to make a new column that takes the last Date value from n periods and repeats it, i.e. imagine that I want do build this variable accounting 2 periods:

Period New_var
1 06-01-2020
1 06-01-2020
1 06-01-2020
2 06-01-2020
2 06-01-2020
2 06-01-2020
3 12-01-2020
3 12-01-2020
3 12-01-2020
4 12-01-2020
4 12-01-2020
4 12-01-2020

This variable would take the last Date observation and repeated across 2 periods in this case. Thank you

I've tried grouping by Period and the mutate the variable as:

rep(last(Date),each=3*2, length.out = n())

This doesn't work, tho.

1

There are 1 best solutions below

2
Harrison Jones On BEST ANSWER

Here is a full solution using a function and with the data you provided:


library(tidyverse)

# importing the data you provided
date_df <- read_table(
  "Period   Date
1   01-01-2020
1   02-01-2020
1   03-01-2020
2   04-01-2020
2   05-01-2020
2   06-01-2020
3   07-01-2020
3   08-01-2020
3   09-01-2020
4   10-01-2020
4   11-01-2020
4   12-01-2020"
) %>%
  mutate(Date = as.Date(Date, format = "%d-%m-%Y")) # important that Date variable has the Date class

# function so that n can be dynamic
group_dates <- function(date_df, n) {
  
  date_df %>%
    mutate(Period_cut = cut_interval(Period, length = n)) %>%
    group_by(Period_cut) %>%
    mutate(New_var = max(Date)) %>%
    ungroup() %>%
    select(-Date, -Period_cut)
  
}

# run with n = 2
group_dates(date_df, 2)