Preliminaries.
Write the squared length of a vector \(x\) as
\[ \|x\|^2 = x^\top x. \]
The least squares estimator
\[ \hat\beta = \arg\min_{\beta} \|Y - X\beta\|^2 \]
solves the problem:
which linear combination of the covariates is ``closest'' to the observed data, \(Y\)?
Define the squared length of the residual vector,
\[ \mathrm{RSS} = \|Y - X\hat\beta\|^2, \]
as the residual sum of squares (RSS), and define
\[ \hat\sigma^2 = \mathrm{RSS}/(n - p). \]
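As a minimal numerical sketch of these definitions (on hypothetical synthetic data; the names `beta_hat`, `rss`, and `sigma2_hat` are illustrative, not from the text):

```python
import numpy as np

# Hypothetical small dataset: n = 6 observations, p = 2 covariates.
rng = np.random.default_rng(0)
n, p = 6, 2
X = rng.standard_normal((n, p))
Y = X @ np.array([1.0, -2.0]) + 0.1 * rng.standard_normal(n)

# Least squares estimate, then the squared length of the residual vector.
beta_hat, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat
rss = resid @ resid           # RSS = ||Y - X beta_hat||^2
sigma2_hat = rss / (n - p)    # estimate of the noise variance, RSS/(n - p)
print(rss, sigma2_hat)
```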
Define an \(n \times n\) matrix \(Q\) to be orthogonal if, for any vector \(v\), \(Q\) satisfies
\[ \|Qv\| = \|v\|. \]
Orthogonal transformations preserve length.
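A quick numerical check of the length-preservation property, using the orthogonal factor of a random matrix's QR decomposition as an example \(Q\) (the construction is illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
# The Q factor of a full-rank square matrix's QR decomposition is orthogonal.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Q^T Q = I, so ||Qv||^2 = v^T Q^T Q v = v^T v = ||v||^2 for any v.
v = rng.standard_normal(n)
print(np.allclose(Q.T @ Q, np.eye(n)))                        # True
print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))   # True
```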
We can apply such a transformation to a regression problem:
\[ \|Y - X\beta\|^2 \quad\text{becomes}\quad \|QY - QX\beta\|^2. \]
Because \(Q\) preserves length, the two objectives are equal for every \(\beta\), so a least squares solution to one problem is a solution to the other.
Can we choose Q to make the problem simpler and more stable?
Choose \(Q\) so that the upper
\(p \times p\) block of \(QX\) is an upper triangular matrix \(R\) and the lower
\((n - p) \times p\) block is all zeroes.
Writing \(QY = (z_1, z_2)\), with \(z_1\) the first \(p\) entries, we now have
\[ \|QY - QX\beta\|^2 = \|z_1 - R\beta\|^2 + \|z_2\|^2, \]
and only the first term depends on
\(\beta\), giving
\[ R\hat\beta = z_1, \]
which can be solved recursively by back-substitution, without inverting \(X^\top X\).
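A minimal numpy sketch of this QR approach on synthetic data, with the recursive back-substitution written out explicitly rather than calling a solver (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 8, 3
X = rng.standard_normal((n, p))
Y = rng.standard_normal(n)

# Thin QR: X = Q R, with Q (n x p) having orthonormal columns
# and R (p x p) upper triangular.
Q, R = np.linalg.qr(X)
z = Q.T @ Y  # the part of the rotated response that depends on beta

# Back-substitution: solve R beta = z from the last row upward,
# with no matrix inversion anywhere.
beta = np.zeros(p)
for i in range(p - 1, -1, -1):
    beta[i] = (z[i] - R[i, i + 1:] @ beta[i + 1:]) / R[i, i]

# Agrees with the standard least squares solution.
print(np.allclose(beta, np.linalg.lstsq(X, Y, rcond=None)[0]))  # True
```

Solving the triangular system directly is also more numerically stable than forming and inverting \(X^\top X\), which squares the condition number of the problem.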