How to perform piecewise linear mixed regression with multiple breakpoints in R?

224 Views Asked by At

I am fitting a piecewise linear mixed regression in R. I know that I can use lme from the nlme package followed by segmented to perform piecewise linear mixed regression. However, upon reading the documentation of the package segmented, I noticed that segmented.lme could only handle 1 breakpoint, in which I have two (at day 30 and day 90).

For a background, I want to model the mileage of cars (macars) at day 0, 30, 90, and 180 (variable days), with age as a confounder. Note that this model is illustrative and not the real data.

This is my prototype code which uses the lme package, before I get confused after reading that segmented.lme can only handle 1 breakpoint:

fit <- lme(macars ~ days + age, random = ~ days | id, data = df)
summary(fit)
pw.fit <- segmented(fit, seg.Z = ~ days, psi = list(days = c(30, 90)),  random = list(id = pdDiag(~1 + days))
summary(pw.fit)

Any help would be greatly appreciated. Thank you very much

1

There are 1 best solutions below

7
Ben Bolker On BEST ANSWER

If you know where the breakpoints are, as suggested by @user255430, you can construct a linear B-spline basis (or more specifically ask the model function to construct one for you) via bs(days, knots = c(30, 90), degree = 1).

However, this isn't parameterized the way you think it is.

library(splines)
days <- 1:200
X <- bs(days, knots = c(30, 90), degree = 1)
par(las = 1, bty = "l") ## cosmetic
matplot(X, type = "l")

enter image description here

As you can see, the last section (days > 90) is predicted by the sum of the second (red dashed) and third (green dotted) components, not just the third component.

You can use a truncated power basis spline instead:

library(cplm)q
## set k=3 to suppress warning
p <- tp(days, knots = c(30, 90), k = 3, degree = 1)
X <- cbind(p$X, p$Z)
matplot(X, type = "l")

enter image description here

However, this is somewhat inconvenient; for only two knots, you can include the components in your model via

~ ... days + I(days*(days>30)) + I(days*(days>90))

In general you should at least consider including all of these terms in your random effects components as well.

You can use emmeans::emmeans to predict the 'intercept' (predicted value) at different time points, and emmeans::emtrends to predict the slope at those time points.