Calculating values of preceding N days

35 Views Asked by At

I have following panel data, with dates, identifier codes, and returns. There are missing dates (non-trading days), but i've omitted that in the example data for simplicity.

DATE ASSET_CODE RET
1990-01-01 1 0.12
1990-01-02 1 0.03
1990-01-03 1 -0.11
1990-01-04 1 0.15
1990-01-05 1 -0.10
1990-01-06 1 0.04
1990-01-07 1 0.10
1990-01-08 1 0.13
1990-01-09 1 -0.12
1990-01-10 1 0.15
1990-01-11 1 -0.08
1990-01-12 1 0.02
1990-01-13 1 0.14
1990-01-14 1 -0.06
1990-01-15 1 0.01
1990-01-16 1 -0.14
1990-01-17 1 0.06
1990-01-18 1 -0.14
1990-01-19 1 -0.13
1990-01-20 1 -0.05
1990-01-21 1 -0.13
1990-01-22 1 0.03
1990-01-23 1 0.07
1990-01-24 1 -0.01
1990-01-25 1 -0.07
1990-01-26 1 0.15
1990-01-27 1 -0.12
1990-01-28 1 0.05
1990-01-29 1 0.15
1990-01-30 1 -0.12
1990-01-31 1 0.05
1990-02-01 1 -0.08
1990-02-02 1 0.08
1990-02-03 1 -0.06
1990-02-04 1 -0.06
1990-02-05 1 -0.07
1990-01-01 2 NA
1990-01-02 2 NA
1990-01-03 2 NA
1990-01-04 2 NA
1990-01-05 2 NA
1990-01-06 2 NA
1990-01-07 2 NA
1990-01-08 2 NA
1990-01-09 2 NA
1990-01-10 2 NA
1990-01-11 2 NA
1990-01-12 2 NA
1990-01-13 2 NA
1990-01-14 2 NA
1990-01-15 2 NA
1990-01-16 2 NA
1990-01-17 2 -0.17
1990-01-18 2 0.17
1990-01-19 2 -0.20
1990-01-20 2 -0.23
1990-01-21 2 0.00
1990-01-22 2 0.24
1990-01-23 2 -0.16
1990-01-24 2 0.13
1990-01-25 2 0.18
1990-01-26 2 -0.10
1990-01-27 2 -0.26
1990-01-28 2 0.18
1990-01-29 2 0.27
1990-01-30 2 0.27
1990-01-31 2 0.09
1990-02-01 2 -0.16
1990-02-02 2 -0.16
1990-02-03 2 -0.13
1990-02-04 2 0.14
1990-02-05 2 0.18

I want to calculate the skewness of the preceding 15-30 values in RET for each ASSET_CODE.

In other words, for asset 1 here, skewness can start to be calculated on 1990-01-16, as 15 preceding values are available. Then, on 1990-01-17, the last 16 values in RET are taken into account etc etc.

Beginning from 1990-01-31, the skewness is calculated based on the preceding 30 values of RET. So on 1990-02-05 for example, skewness is based on returns from 1990-01-06 to 1990-02-04

1

There are 1 best solutions below

0
G. Grothendieck On

Define Skewness which calculates the skewness if its argument has at least 15 non-NA's. list(-seq(30)) means use offsets -1 to -30 for each window and partial=TRUE means if there are not 30 rows available use however many there are.

library(dplyr)
library(e1071)
library(zoo)

Skewness <- function(x) {
  x <- na.omit(x)
  if (length(x) < 15) NA else skewness(x)
}

DF %>%
    mutate(roll = rollapply(RET, list(-seq(30)), Skewness, 
      fill = NA, partial = TRUE), .by = ASSET_CODE)

Note

DF <-
structure(list(DATE = c("1990-01-01", "1990-01-02", "1990-01-03", 
"1990-01-04", "1990-01-05", "1990-01-06", "1990-01-07", "1990-01-08", 
"1990-01-09", "1990-01-10", "1990-01-11", "1990-01-12", "1990-01-13", 
"1990-01-14", "1990-01-15", "1990-01-16", "1990-01-17", "1990-01-18", 
"1990-01-19", "1990-01-20", "1990-01-21", "1990-01-22", "1990-01-23", 
"1990-01-24", "1990-01-25", "1990-01-26", "1990-01-27", "1990-01-28", 
"1990-01-29", "1990-01-30", "1990-01-31", "1990-02-01", "1990-02-02", 
"1990-02-03", "1990-02-04", "1990-02-05", "1990-01-01", "1990-01-02", 
"1990-01-03", "1990-01-04", "1990-01-05", "1990-01-06", "1990-01-07", 
"1990-01-08", "1990-01-09", "1990-01-10", "1990-01-11", "1990-01-12", 
"1990-01-13", "1990-01-14", "1990-01-15", "1990-01-16", "1990-01-17", 
"1990-01-18", "1990-01-19", "1990-01-20", "1990-01-21", "1990-01-22", 
"1990-01-23", "1990-01-24", "1990-01-25", "1990-01-26", "1990-01-27", 
"1990-01-28", "1990-01-29", "1990-01-30", "1990-01-31", "1990-02-01", 
"1990-02-02", "1990-02-03", "1990-02-04", "1990-02-05"), ASSET_CODE = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L), RET = c(0.12, 0.03, -0.11, 0.15, 
-0.1, 0.04, 0.1, 0.13, -0.12, 0.15, -0.08, 0.02, 0.14, -0.06, 
0.01, -0.14, 0.06, -0.14, -0.13, -0.05, -0.13, 0.03, 0.07, -0.01, 
-0.07, 0.15, -0.12, 0.05, 0.15, -0.12, 0.05, -0.08, 0.08, -0.06, 
-0.06, -0.07, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, -0.17, 0.17, -0.2, -0.23, 0, 0.24, -0.16, 0.13, 
0.18, -0.1, -0.26, 0.18, 0.27, 0.27, 0.09, -0.16, -0.16, -0.13, 
0.14, 0.18)), class = "data.frame", row.names = c(NA, -72L))