Sum of Squared Residuals
$$\mathrm{SSR} = (y - X\hat{\beta})'(y - X\hat{\beta}) = y'y - \hat{\beta}'X'y - y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta}$$

Since $\hat{\beta}'X'y$ is a scalar, it equals its own transpose $y'X\hat{\beta}$, so the two cross terms combine:

$$\mathrm{SSR} = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}$$
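A quick numerical check of this expansion, sketched in NumPy (the data and the candidate vector b are randomly generated for illustration; the identity holds for any b, not just the OLS estimate):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))   # illustrative design matrix
y = rng.normal(size=n)        # illustrative response
b = rng.normal(size=k)        # any candidate coefficient vector

ssr_direct = (y - X @ b) @ (y - X @ b)
ssr_expanded = y @ y - 2 * b @ (X.T @ y) + b @ (X.T @ X) @ b
print(np.isclose(ssr_direct, ssr_expanded))  # True
```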
Minimise the SSR

Differentiating the SSR with respect to $\hat{\beta}$ and setting the derivative to zero gives the first-order condition

$$\frac{\partial\,\mathrm{SSR}}{\partial\hat{\beta}} = -2X'y + 2X'X\hat{\beta} = 0$$

From this minimum condition we get the "normal equations":

$$(X'X)\hat{\beta} = X'y$$
Solve for the OLS estimator $\hat{\beta}$ by premultiplying both sides by $(X'X)^{-1}$:

$$(X'X)^{-1}(X'X)\hat{\beta} = (X'X)^{-1}X'y$$

By definition, $(X'X)^{-1}(X'X) = I$, so

$$I\hat{\beta} = (X'X)^{-1}X'y$$

$$\hat{\beta} = (X'X)^{-1}X'y$$
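As a sanity check, a minimal NumPy sketch (with randomly generated data) that computes $\hat{\beta}$ from the normal equations and compares it against NumPy's least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 3
X = rng.normal(size=(n, k))
y = rng.normal(size=n)

# Textbook formula: beta = (X'X)^{-1} X'y
beta_formula = np.linalg.inv(X.T @ X) @ X.T @ y
# Numerically preferable: solve the normal equations (X'X) beta = X'y
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)
# Library least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(beta_formula, beta_solve))  # True
print(np.allclose(beta_formula, beta_lstsq))  # True
```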
Properties |
The observed values of X are uncorrelated with the residuals. |
$X'e = 0$ implies that for every column $x_k$ of $X$, $x_k'e = 0$.
To see this, substitute $y = X\hat{\beta} + e$ into the normal equations:

$$(X'X)\hat{\beta} = X'(X\hat{\beta} + e)$$

$$(X'X)\hat{\beta} = (X'X)\hat{\beta} + X'e$$

$$X'e = 0$$
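A one-line numerical confirmation (assumed random data; residuals computed from the OLS fit):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = rng.normal(size=100)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
print(np.allclose(X.T @ e, 0))  # True: each column of X is orthogonal to e
```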
The sum of the residuals is zero. |
If there is a constant, then the first column of $X$ (i.e. $x_1$) will be a column of ones. The first element of the vector $X'e$ is then $1 \times e_1 + 1 \times e_2 + \dots + 1 \times e_n = \sum_i e_i$, so for it to be zero it must be the case that $\sum_i e_i = 0$.
The sample mean of the residuals is zero. |
$\bar{e} = \sum_i e_i / n = 0$.
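Verifying both facts in NumPy; note the explicit column of ones, since these two properties require a constant term (data again randomly generated):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant included
y = rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
print(np.isclose(e.sum(), 0), np.isclose(e.mean(), 0))  # True True
```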
The regression hyperplane passes through the means of the observed values (X and y). |
This follows from the fact that $\bar{e} = 0$. Recall that $e = y - X\hat{\beta}$. Averaging over the $n$ observations gives $\bar{e} = \bar{y} - \bar{x}'\hat{\beta} = 0$, where $\bar{x}$ is the vector of column means of $X$. This implies that $\bar{y} = \bar{x}'\hat{\beta}$, which shows that the regression hyperplane goes through the point of means of the data.
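The same point checked numerically (constant term included, as before):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
x_bar = X.mean(axis=0)  # column means of X (its first entry is 1)
print(np.isclose(y.mean(), x_bar @ beta_hat))  # True: hyperplane through the means
```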
The predicted values of y are uncorrelated with the residuals. |
$\hat{y}'e = (X\hat{\beta})'e = \hat{\beta}'X'e = 0$
The mean of the predicted $y$'s for the sample will equal the mean of the observed $y$'s: $\bar{\hat{y}} = \bar{y}$. This follows from $\hat{y} = y - e$ and $\bar{e} = 0$.
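Both of these last two properties checked at once (random data, constant included so that $\bar{e} = 0$ holds):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat
e = y - y_hat
print(np.isclose(y_hat @ e, 0))            # fitted values orthogonal to residuals
print(np.isclose(y_hat.mean(), y.mean()))  # mean of fitted equals mean of observed
```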
The Gauss-Markov Theorem: Proof that $\hat{\beta}$ is an unbiased estimator of $\beta$

The Gauss-Markov theorem states that, under the classical assumptions, $\hat{\beta}$ is the best linear unbiased estimator (BLUE) of $\beta$; the unbiasedness and linearity parts are proved below.
Substitute the model $y = X\beta + \varepsilon$ into the estimator:

$$\hat{\beta} = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + (X'X)^{-1}X'\varepsilon$$

using $(X'X)^{-1}X'X = I$. Taking expectations (treating $X$ as fixed):

$$E[\hat{\beta}] = \beta + (X'X)^{-1}X'E[\varepsilon]$$

Since $E[\varepsilon] = 0$ by assumption,

$$E[\hat{\beta}] = \beta$$
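A Monte Carlo sketch of unbiasedness, under assumed true coefficients and a fixed design: the average of $\hat{\beta}$ across many simulated samples should sit close to $\beta$ (all numbers here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k, reps = 100, 3, 5000
beta = np.array([1.0, -2.0, 0.5])   # assumed true coefficients
X = rng.normal(size=(n, k))         # design held fixed across replications

estimates = np.empty((reps, k))
for r in range(reps):
    eps = rng.normal(size=n)        # errors with E[eps] = 0
    y = X @ beta + eps
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(estimates.mean(axis=0))  # close to [1.0, -2.0, 0.5]
```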
Proof that $\hat{\beta}$ is a linear estimator of $\beta$.

From above, $\hat{\beta} = \beta + (X'X)^{-1}X'\varepsilon$. Letting $A = (X'X)^{-1}X'$,

$$\hat{\beta} = \beta + A\varepsilon$$

which is linear in $\varepsilon$; equivalently, $\hat{\beta} = Ay$, a linear function of the observations $y$.
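A short sketch of the linearity claim: the matrix $A$ depends only on $X$, and $\hat{\beta}$ is just $A$ applied to $y$ (random data for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 3))
y = rng.normal(size=100)

A = np.linalg.inv(X.T @ X) @ X.T  # A depends on X only, not on y
beta_hat = A @ y                  # beta-hat is a linear function of y
print(np.allclose(beta_hat, np.linalg.solve(X.T @ X, X.T @ y)))  # True
```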