QR Factorization for Linear Regression Line in R (using qr.solve)


So I'm trying to figure out how to use the command qr.solve in R to solve for the components of a best-fit line (slope and intercept) for a particular data set. I already know how to do this using the lm function, but to my understanding the QR factorization of a matrix can yield the same thing. Would someone be able to explain how this is true, and what its benefits are over a simple linear model call? Can the inputs to qr.solve be computed in R, or would I have to work them out by hand first and plug them in afterwards?

I tried putting my data set (2 columns, each of whose rows represents a point on a graph) into a matrix and using that as the first argument of qr.solve. However, I'm not too sure what to plug in for b.


2 Answers

Vons (accepted answer):

QR decomposition can be used to solve Ax = b. The least-squares derivation leads to the normal equations X'X β = X'y, so we can take A = X'X and b = X'y and solve for x = β.

?qr.solve

mtcars                          # built-in data set used below
lm(mpg ~ cyl, data = mtcars)    # coefficients from lm, for comparison
with(mtcars, {
  X <- cbind(1, cyl)   # design matrix: intercept column plus predictor
  y <- mpg             # response
  A <- t(X) %*% X      # A = X'X
  b <- t(X) %*% y      # b = X'y
  qr.solve(A, b)       # solve A beta = b for the same coefficients
})
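To answer the b question directly: with A = X'X, b is X'y. Note also that qr.solve accepts a rectangular first argument, in which case it returns the least-squares solution, so as a shorter sketch you can pass the design matrix and the response straight in:

# same slope and intercept, without forming X'X explicitly
with(mtcars, qr.solve(cbind(1, cyl), mpg))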
Sandipan Dey:

Just elaborating a little more,

Linear regression aims to solve the least squares problem with the normal equation:

β = inv(X'X)(X'y) = solve(X'X, X'y)

Here, X represents the matrix with the independent predictor variables and y is the dependent response variable, β being the coefficients to be estimated. Also, X' denotes the transpose of matrix X.

With QR decomposition, X is decomposed into a matrix Q with orthonormal columns and an upper triangular matrix R, as follows:

X = QR, so that Q'Q = I
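In R, qr() computes the decomposition and qr.Q() / qr.R() extract the factors; a quick sketch with a small made-up design matrix, just to confirm the two properties:

set.seed(42)
X <- cbind(1, matrix(rnorm(20), ncol = 2))   # toy 10 x 3 design matrix
q <- qr(X)
Q <- qr.Q(q)   # columns are orthonormal
R <- qr.R(q)   # upper triangular
max(abs(crossprod(Q) - diag(ncol(Q))))   # ~ 0, i.e. Q'Q = I
max(abs(Q %*% R - X))                    # ~ 0, i.e. X = QR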

Then β can be estimated as shown below:

β = inv(X'X)(X'y) = inv(R'Q'QR)(R'Q'y) = inv(R'R)(R'Q'y) = inv(R)inv(R')(R')(Q'y)

i.e.,

β = inv(R)(Q'y) = solve(R, Q'y)

The above shows why QR decomposition makes the least-squares solve much cheaper: since R is an upper-triangular matrix, the system Rβ = Q'y can be solved very efficiently by back substitution.
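In base R, the back-substitution step is backsolve(), which exploits the triangular structure of R; a minimal self-contained sketch on toy data (crossprod(Q, y) is Q'y):

set.seed(1)
X <- cbind(1, matrix(rnorm(30), ncol = 3))   # toy 10 x 4 design matrix
y <- rnorm(10)                               # toy response
q <- qr(X)
Q <- qr.Q(q)
R <- qr.R(q)
backsolve(R, crossprod(Q, y))   # solves R beta = Q'y by back substitution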

Now, let's compare the performance of the different approaches to solving the linear system, using some randomly generated data:

m <- 100 # number of data points
n <- 6   # number of predictor variables
set.seed(1)
X <- matrix(rnorm(m*n), nrow=m)
X1 <- cbind(rep(1,m), X)                     # add intercept term
beta <- matrix(sample(1:(n+1), n+1), ncol=1) # true coefficients
y <- X1 %*% beta + rnorm(m)                  # response with noise

First let's confirm that all of the following methods produce the same output (the same coefficients are estimated):

lm(y~X)$coeff   
# (Intercept)          X1          X2          X3          X4          X5          X6 
#   6.5233774  -0.2827844   0.3469484  -0.9083762   0.4903771  -0.3217761  -0.0326980 

as.numeric(solve(t(X1)%*%X1, t(X1)%*%y))
# [1]  6.5233774 -0.2827844  0.3469484 -0.9083762  0.4903771 -0.3217761 -0.0326980

q <- qr(X1)
as.numeric(solve(qr.R(q), t(qr.Q(q))%*%y))
# [1]  6.5233774 -0.2827844  0.3469484 -0.9083762  0.4903771 -0.3217761 -0.0326980

as.numeric(qr.solve(X1, y))
# [1]  6.5233774 -0.2827844  0.3469484 -0.9083762  0.4903771 -0.3217761 -0.0326980

As expected, we obtain exactly the same coefficient values with all of the above approaches.

Now, let's compare the runtime of solving the linear system with the normal equations vs. with the QR decomposition (assuming that Q and R are already precomputed).

library(microbenchmark)
library(ggplot2)
q <- qr(X1)
Q <- qr.Q(q)
R <- qr.R(q)
autoplot(
  microbenchmark(solve(t(X1)%*%X1, t(X1)%*%y),
                 solve(R, t(Q)%*%y), times=1000L)
) 

As expected again, the solution is obtained much faster with the matrices obtained from the QR decomposition.

[Benchmark plot: runtime distributions of the normal-equation solve vs. the QR-based solve, from autoplot(microbenchmark(...))]
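If you prefer numeric timings to a plot, summary() of the microbenchmark object reports them; as a further comparison, the backsolve() variant sketched above can be added as a third contender, reusing X1, y, Q and R from the code above:

mb <- microbenchmark(
  normal_eq    = solve(t(X1) %*% X1, t(X1) %*% y),
  qr_solve     = solve(R, t(Q) %*% y),
  qr_backsolve = backsolve(R, crossprod(Q, y)),
  times = 1000L
)
summary(mb)   # min / median / mean runtime per expression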