Fast lagged dependent variables weighted least squares model on large datasets?

48 Views Asked by At

I am looking to fit an autoregressive AR(p) model to 525600 observations, with weights. I want both the fitted values and the coefficients (including the intercept). I am able to do it, but I was wondering if there is a faster, better way.

Currently, I make the design matrix myself with matfun() from the time series RW. Reprex below:

order <- 10
ndat <- 525600
RW <- arima.sim(model= list(order = c(0, 1, 0)), n=ndat)[2:(ndat+1)]


matfun <- function(RW, order){
  RW <- as.matrix(RW)
  n <- nrow(RW)
  dmat <- matrix(1, n, order)
  for (j in 1:order) {
    dmat[, j] <- c(rep(0,j), RW[1:(n-j)])
  }
  return(dmat[(order+1):n,])
}

matt <- matfun(RW, order = order)
w <- runif(nrow(matt), 0, 1)


mod <- lm(RW[(order+1):ndat]~matt,weights=w)

With larger AR orders, both the design matrix and linear model step become slow. I tried the ar() function but it doesn't let me add weights, and I think the lack of grouping prevented gls() from nlme being helpful (and I couldn't figure out precalculated weights). I do know of the faster lm functions, but maybe there's an even better way? If there isn't a function that can do all this, do you see any way I could speed up any of the design matrix or lm step up?

TIA! :)

EDIT: It seems my question is poorly posed!! Thanks to Ben Bolker for commenting:

You should be aware that including lagged variables in a regression is not the same as fitting an AR(p) model ...

I am definitely looking to include lagged variables as I have done above, not fit an AR(p) model as I stated.

0

There are 0 best solutions below