Econometrics Cheat Sheet (DRAFT)

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Properties of OLS in Matrix Form

Sum of Squared Residuals
(y − Xβˆ)′(y − Xβˆ)
y′y − βˆ′X′y − y′Xβˆ + βˆ′X′Xβˆ
y′y − 2βˆ′X′y + βˆ′X′Xβˆ
Minimise the SSR
∂(SSR)/∂βˆ = −2X′y + 2X′Xβˆ = 0
From the minimum we get the "normal equation":
(X′X)βˆ = X′y
Solve for the OLS estimator βˆ by pre-multiplying both sides by (X′X)⁻¹
(X′X)⁻¹(X′X)βˆ = (X′X)⁻¹X′y
By definition, (X′X)⁻¹(X′X) = I
Iβˆ = (X′X)⁻¹X′y
βˆ = (X′X)⁻¹X′y
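The closed form βˆ = (X′X)⁻¹X′y can be checked numerically. The sketch below is a minimal illustration, assuming NumPy is available; the function name `ols_normal_equations` and the data are made up for the example.

```python
import numpy as np

def ols_normal_equations(X, y):
    """Return the OLS estimate by solving the normal equations (X'X) b = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    XtX = X.T @ X  # X'X
    Xty = X.T @ y  # X'y
    # Solving the linear system avoids forming (X'X)^-1 explicitly.
    return np.linalg.solve(XtX, Xty)

# Made-up data: a column of ones (the constant) plus one regressor.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 3.0, 6.0])
beta_hat = ols_normal_equations(X, y)
print(beta_hat)  # intercept first, then slope
```

Solving the system rather than inverting X′X is a numerical-stability choice; the estimate is the same in exact arithmetic.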
The observed values of X are uncorrelated with the residuals.
X′e = 0 implies that for every column xk of X, xk′e = 0.
Substitute y = Xβˆ + e into the normal equation:
(X′X)βˆ = X′(Xβˆ + e)
(X′X)βˆ = (X′X)βˆ + X′e
X′e = 0
The sum of the residuals is zero.
If there is a constant, then the first column of X (i.e. X1) will be a column of ones. This means that for the first element of the X′e vector (i.e. X11×e1 + X12×e2 + ... + X1n×en) to be zero, it must be the case that ∑ei = 0.
The sample mean of the residuals is zero.
ē = ∑ei/n = 0.
The regression hyperplane passes through the means of the observed values (X and y).
This follows from the fact that ē = 0. Recall that e = y − Xβˆ. Summing over the observations and dividing by their number, we get ē = ȳ − x̄βˆ = 0. This implies that ȳ = x̄βˆ, which shows that the regression hyperplane goes through the point of means of the data.
The predicted values of y are uncorrelated with the residuals.
yˆ′e = (Xβˆ)′e = βˆ′X′e = 0
The mean of the predicted y's for the sample equals the mean of the observed y's: (1/n)∑yˆi = ȳ
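These algebraic properties of the residuals can be verified on any data set that includes a constant. A minimal sketch, assuming NumPy and made-up data; the variable names are illustrative only.

```python
import numpy as np

# Made-up data with a constant in the first column.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 3.0, 6.0])

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat  # fitted values
e = y - y_hat         # residuals

orthogonal_to_X = np.allclose(X.T @ e, 0.0)       # X'e = 0
residuals_sum_to_zero = np.isclose(e.sum(), 0.0)  # relies on the constant
passes_through_means = np.isclose(y.mean(), X.mean(axis=0) @ beta_hat)
fitted_orthogonal = np.isclose(y_hat @ e, 0.0)    # y_hat'e = 0
same_mean = np.isclose(y_hat.mean(), y.mean())    # mean of fitted = mean of y
```

Dropping the column of ones from X would generally break the sum-to-zero and same-mean properties, while X′e = 0 would still hold.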
The Gauss-Markov Theorem: Proof that βˆ is an unbiased estimator of β
βˆ = (X′X)⁻¹X′y = (X′X)⁻¹X′(Xβ + ε)
= β + (X′X)⁻¹X′ε
given (X′X)⁻¹X′X = I
E[βˆ] = E[β] + E[(X′X)⁻¹X′ε] = β + (X′X)⁻¹X′E[ε] = β
where X is treated as fixed and E[ε] = 0, so E[X′ε] = X′E[ε] = 0
Proof that βˆ is a linear estimator of β.
βˆ = β + (X′X)⁻¹X′ε, where (X′X)⁻¹X′ = A
βˆ = Ay = β + Aε ⇒ βˆ is a linear function of y (and of ε)
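Unbiasedness can be illustrated, though not proved, by simulation. The sketch below assumes NumPy, a fixed X, and arbitrarily chosen "true" parameters; the sample size, number of replications and seed are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 2000
beta = np.array([1.0, 2.0])  # "true" parameters, chosen arbitrarily

# X is held fixed across replications, matching the fixed-X argument.
X = np.column_stack([np.ones(n), rng.uniform(0.0, 1.0, size=n)])
A = np.linalg.solve(X.T @ X, X.T)  # A = (X'X)^-1 X'

estimates = np.empty((reps, 2))
for r in range(reps):
    eps = rng.normal(0.0, 1.0, size=n)  # errors with E[eps] = 0
    y = X @ beta + eps
    estimates[r] = A @ y                # beta_hat = A y = beta + A eps

mean_estimate = estimates.mean(axis=0)
print(mean_estimate)  # the average of the estimates should sit near beta
```

Each individual estimate differs from β; it is the average over many samples that settles close to it.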


The statistics used to test hypotheses under the Gauss-Markov assumptions are not valid in the presence of heteroskedasticity.
Valid estimator of Var(βˆ1) under heteroskedasticity of any form:
∑[(xi − x̄)² uˆi²] / SSTx²
SSTx = ∑(xi − x̄)²
Robust standard error: the square root of this estimator
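For the simple regression case, the heteroskedasticity-robust variance ∑(xi − x̄)²uˆi² / SSTx² can be computed by hand. A minimal sketch in plain Python; the function name `robust_se_slope` and the data are hypothetical, and no small-sample correction is applied.

```python
import math

def robust_se_slope(x, y):
    """Heteroskedasticity-robust standard error of the slope in y = b0 + b1*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sst_x = sum(d * d for d in dx)  # SSTx
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    u_hat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # OLS residuals
    var_b1 = sum(d * d * u * u for d, u in zip(dx, u_hat)) / sst_x ** 2
    return math.sqrt(var_b1)

# Made-up data.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 5.0]
se = robust_se_slope(x, y)
```

Regression packages often multiply this variance by a degrees-of-freedom factor, so their reported robust standard errors may differ slightly from this value.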


Normality Assumption:
u is normally distributed with zero mean and constant variance
Var(u) = σ²
(βˆj − βj)/se(βˆj) ~ t(n−k−1) = t(df)
H0: βj = 0
The t statistic is used in testing hypotheses about a single population parameter.
Test statistic
tβˆj = βˆj/se(βˆj) ~ t(n−k−1) under H0
t = (estimate − hypothesised value) / standard error
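Alternative hypothesis, one-sided
H1: βj > 0
Reject H0 if tβˆj > c [c = 5% one-sided critical value]
H1: βj < 0
Reject H0 if tβˆj < −c [c = 5% one-sided critical value]
Two-sided
H1: βj ≠ 0
Reject H0 if |tβˆj| > c [c = critical value with 2.5% in each tail]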
If H0 is rejected:
xj is statistically significant (significantly different from zero) at the 5% level
If H0 is not rejected:
xj is statistically insignificant at the 5% level
p-value: the smallest significance level at which the null hypothesis would be rejected
Confidence Interval
βˆj ± c·se(βˆj)
where c is the 97.5th percentile of a t(n−k−1) distribution
This gives a 95% confidence interval (5% significance level)
H0: βj = aj is rejected against H1: βj ≠ aj if aj is not in the 95% confidence interval
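The two-sided t test and the confidence interval use the same critical value, so they always agree. A minimal sketch in plain Python; `t_test_and_ci` is a hypothetical helper, the numbers are invented, and the critical value c is supplied by the caller from a t table.

```python
def t_test_and_ci(estimate, se, c, hypothesised=0.0):
    """Two-sided t test and the matching confidence interval.

    c is the critical value, looked up by the caller in a t table.
    """
    t = (estimate - hypothesised) / se
    reject = abs(t) > c
    ci = (estimate - c * se, estimate + c * se)
    return t, reject, ci

# Hypothetical numbers: estimate 0.5, standard error 0.2, and c = 2.0,
# roughly the 97.5th percentile of a t distribution with 60 df.
t, reject, ci = t_test_and_ci(0.5, 0.2, 2.0)
print(t, reject, ci)
```

Here H0: βj = 0 is rejected and, equivalently, zero lies outside the interval.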
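Testing a linear combination: H0: β1 = β2 against H1: β1 < β2 ⇔ β1 − β2 < 0
t = (βˆ1 − βˆ2) / se(βˆ1 − βˆ2)
se(βˆ1 − βˆ2) = √Var(βˆ1 − βˆ2)
Var(βˆ1 − βˆ2) = Var(βˆ1) + Var(βˆ2) − 2Cov(βˆ1, βˆ2)
Alternative to calculating se(βˆ1 − βˆ2):
Let θ = β1 − β2, so β1 = θ + β2
H0: θ = 0, H1: θ < 0
Substituting β1 = θ + β2 into the model y = β0 + β1x1 + β2x2 + β3x3 + u, we obtain
y = β0 + θx1 + β2(x1 + x2) + β3x3 + u
The standard error reported for the coefficient on x1 is then se(θˆ).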
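When the estimated variances and covariance are available, the t statistic for a difference of two coefficients can be computed directly. A minimal sketch in plain Python; `t_stat_difference` is a hypothetical helper and all input values are invented.

```python
import math

def t_stat_difference(b1, b2, var_b1, var_b2, cov_b1_b2):
    """t statistic for H0: beta1 = beta2 from estimated variances and covariance."""
    var_diff = var_b1 + var_b2 - 2.0 * cov_b1_b2
    se_diff = math.sqrt(var_diff)
    return (b1 - b2) / se_diff, se_diff

# Hypothetical estimates and covariance-matrix entries.
t, se_diff = t_stat_difference(b1=0.30, b2=0.50,
                               var_b1=0.01, var_b2=0.02, cov_b1_b2=0.005)
print(t, se_diff)
```

Ignoring the covariance term is a common mistake; with a positive covariance it overstates the standard error of the difference.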
F Test
F = [(SSRr − SSRur)/q] / [SSRur/(n−k−1)]
q = number of restrictions = df r − df ur
n−k−1 = df ur
R² form of the F statistic
SSR = SST(1 − R²)
F = [(R²ur − R²r)/q] / [(1 − R²ur)/df ur]
Remember not to square the reported R² value; it is already squared.
Overall signif­icance of the regression
Testing joint exclusion
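The SSR form and the R² form of the F statistic give the same number when both models share the same dependent variable, since SSR = SST(1 − R²). A minimal sketch in plain Python; the function names and all numbers are made up for illustration.

```python
def f_stat_ssr(ssr_r, ssr_ur, q, df_ur):
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

def f_stat_r2(r2_r, r2_ur, q, df_ur):
    """F statistic from restricted and unrestricted R-squared values."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_ur)

# Invented values: SST is common to both models.
sst = 200.0
ssr_r, ssr_ur = 120.0, 100.0
q, df_ur = 2, 50

f_from_ssr = f_stat_ssr(ssr_r, ssr_ur, q, df_ur)
f_from_r2 = f_stat_r2(1.0 - ssr_r / sst, 1.0 - ssr_ur / sst, q, df_ur)
print(f_from_ssr, f_from_r2)
```

The resulting value is compared with a critical value from the F(q, df ur) distribution.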

Data Scaling

If xj is multiplied by c
Its coefficient is divided by c
If the dependent variable is multiplied by c
All OLS coefficients are multiplied by c
Neither t nor F statistics are affected
Beta coefficients
Obtained from an OLS regression after the dependent and independent variables have been transformed into z-scores
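The scaling rules and the beta coefficient can be checked on a small simple regression. A minimal sketch in plain Python; `simple_ols`, `z_scores` and the data are hypothetical, and the sample standard deviation (divisor n − 1) is an assumption.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope) of the OLS fit y = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    return y_bar - b1 * x_bar, b1

def z_scores(v):
    """Standardise a list using its mean and sample standard deviation."""
    n = len(v)
    mean = sum(v) / n
    sd = math.sqrt(sum((vi - mean) ** 2 for vi in v) / (n - 1))
    return [(vi - mean) / sd for vi in v]

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 3.0, 6.0]
c = 10.0

b0, b1 = simple_ols(x, y)
b0_xs, b1_xs = simple_ols([c * xi for xi in x], y)  # regressor multiplied by c
b0_ys, b1_ys = simple_ols(x, [c * yi for yi in y])  # dependent variable multiplied by c
_, beta_coef = simple_ols(z_scores(x), z_scores(y)) # standardised (beta) coefficient
```

Scaling x leaves the intercept unchanged and divides the slope by c; scaling y multiplies both the intercept and the slope by c.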

Dummy Variables

Dummy/Binary Variables
= yes/no variables
= take on the values 0 and 1 to identify the mutually exclusive classes of the explan­atory variables.
= leads to regression models where the parameters have very natural interp­ret­ations
Given: wage = β0 + δ0 female + β1 edu + u
To solve for δ0:
δ0 = E(wage | female = 1, edu) − E(wage | female = 0, edu), where the level of education is the same
Graphically, δ0 =
an intercept shift
male intercept = β0
female intercept = β0 + δ0
Dummy variable trap =
when both dummy variables (male & female) are included together with the intercept, resulting in perfect collinearity
If a qualitative variable has m levels,
then (m−1) dummy variables are required, each taking the values 0 and 1.
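The (m−1) rule and the dummy variable trap can be seen directly in the coded columns. A minimal sketch in plain Python; the `dummies` helper, the `region` variable and its levels are all hypothetical.

```python
def dummies(values, base):
    """Encode a qualitative variable with m levels as m-1 dummy columns, omitting `base`."""
    levels = sorted(set(values))
    kept = [lv for lv in levels if lv != base]
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    return kept, rows

# Hypothetical qualitative variable with m = 3 levels.
region = ["north", "south", "west", "north", "west"]
kept, rows = dummies(region, base="north")

# Dummy variable trap: with all m dummies, every row sums to 1,
# which reproduces the column of ones used for the intercept.
all_levels = sorted(set(region))
full = [[1 if v == lv else 0 for lv in all_levels] for v in region]
sums_to_constant = all(sum(row) == 1 for row in full)
```

The omitted level ("north" here) becomes the base group, and each dummy coefficient is read as a difference from that base.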
Hypothesis test
Test whether the two regression models are identical (assuming a model with a dummy D and an interaction term, y = β0 + β1x + β2D + β3(D·x) + u):
H0: β2 = β3 = 0
H1: β2 ≠ 0 and/or β3 ≠ 0.
Failing to reject H0 indicates that a single model is sufficient to explain the relationship.
Test whether the two models differ with respect to intercepts only and have the same slopes:
H0: β3 = 0
H1: β3 ≠ 0.
Treating a quantitative variable as a qualitative variable increases the complexity of the model.
The degrees of freedom for error are reduced.
This can affect the inferences if the data set is small.