
Foundation of Statistics with Michael Cronin Ch 1 Cheat Sheet (DRAFT)

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Simple Linear Regression

Regression
Studies the relationship between quantitative variables.
Simple Linear Regression
Only considers 2 variables.
Response Variable
Usually denoted Y. We attempt to predict this.
Predictor Variable
Usually denoted X. We use this to predict Y.
(xᵢ, yᵢ)
The values of X and Y at case i. We usually denote n to be the number of cases.
 
Outline of Simple Linear Regression
Assume a linear relationship between X and Y: Y = β₀ + β₁X
β₀
The intercept, i.e. the value of Y when X = 0, i.e. where the line crosses the Y axis.
β₁
The slope: the change in Y for a single unit change in X.
 
We estimate β₀ and β₁ from the data and use the model to predict Y for any given X.
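For example (hypothetical numbers, for illustration only): with β₀ = 2 and β₁ = 0.5 the model is Y = 2 + 0.5X, so X = 10 gives a predicted Y = 2 + 0.5 × 10 = 7.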
 
Methods of Linear Regression
Scatter Plot
Put all points on a scatter plot and gauge visually whether or not the relationship looks linear.
Line of Closest Fit
If the relationship looks linear, we find the line of closest fit and use it to estimate β₀ and β₁.

Covariance and Independent Variables

Independent Events
P(A|B) = P(A)
Independent Discrete Variables
P(X = x and Y = y) = P(X = x)P(Y = y)
Independent Continuous Variables
The joint pdf of X and Y: h(x,y) = f_X(x)g_Y(y), the product of the individual pdfs.
Covariance
"the mean value of the product of the deviations of two variates from their respective means"
Covariance of X and Y = cov(X,Y) = E[(X − μ₁)(Y − μ₂)], where μ₁ = E(X) and μ₂ = E(Y)
Covariance of independent variables
cov(X,Y)=0
Covariance as defined by the book
Measures the association between X and Y, the extent to which they vary together.
If large X occurs with large Y and small X with small Y, there is a positive association, i.e. cov(X,Y) > 0.
If large X occurs with small Y and large Y occurs with small X, there is a negative association, i.e. cov(X,Y) < 0.
Direction of association
+ indicates positive direction, − indicates negative direction.
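A minimal Python sketch (illustrative only; the data values are hypothetical, not from the book) computing the sample covariance directly from this definition:

import numpy as np

# Hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Sample covariance: mean product of deviations from the respective means
# (dividing by n - 1 gives the usual sample version)
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
print(cov_xy)              # positive here, i.e. positive association
print(np.cov(x, y)[0, 1])  # numpy's built-in sample covariance agrees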

Least Squares Criterion

Intro
In a scatter plot there could be many potential lines that fit the data. We use the Least Squares Criterion to select the best line.
eᵢ (error)
The difference between what the line says the value should be and what it actually is.
êᵢ (residual)
The difference between the fitted line and the actually observed value.
Residual Sum of Squares (RSS)
We choose β₀ and β₁ so as to minimize RSS.
^ above a letter indicates we are using an estimator

Least Squares: Important Formula

\mathrm{RSS} = \min\left( \sum_{i=1}^{n} \hat{e}_i^2 \right)

Through partial differentiation we derive the estimators:

\hat{\beta}_1 = \frac{SXY}{SXX} \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
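A minimal Python sketch (illustrative; the data and variable names are my own, with SXX = Σ(xᵢ − x̄)² and SXY = Σ(xᵢ − x̄)(yᵢ − ȳ)) of these estimators:

import numpy as np

# Hypothetical data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Corrected sums of squares and cross-products
SXX = np.sum((x - x.mean()) ** 2)
SXY = np.sum((x - x.mean()) * (y - y.mean()))

beta1_hat = SXY / SXX                        # slope estimate
beta0_hat = y.mean() - beta1_hat * x.mean()  # intercept estimate

residuals = y - (beta0_hat + beta1_hat * x)
RSS = np.sum(residuals ** 2)                 # minimized by these estimates
print(beta0_hat, beta1_hat, RSS)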
 

Errors

Real data almost never falls in a perfectly straight line, i.e. real data rarely has a perfectly linear relationship. As such, real data has errors, which could be...
- Measurement Errors: continuous variables cannot be measured with 100% accuracy.
- An effect of variables not included in the model.
- Natural variability.

We should incorporate them into our simple linear regression models, e.g.

yᵢ = β₀ + β₁xᵢ + eᵢ, where eᵢ is the error on the ith case


and

yᵢ = β₀ + β₁xᵢ is the true regression line


Assumptions about errors:
We make these assumptions as we need them to...
- prove the optimality of the estimates for β₀ and β₁
- construct confidence intervals for β₀ and β₁


eᵢ ~ NID(0, σ²)
- N: Normally
- I: Independently
- D: Distributed
- σ²: Common Variance
- "eᵢ is normally and independently distributed with mean 0 and common variance σ²"

These assumptions can also be expressed in terms of covariance:
E(eᵢ) = 0, var(eᵢ) = σ², cov(eᵢ, eⱼ) = 0, for i ≠ j
- "The expected value of eᵢ is 0, its variance is σ², and the covariance of eᵢ and eⱼ is 0 where i is not j"

Combined with the normality assumption, this implies the eᵢ are independent.

These assumptions must be verified when applying a regression model.
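A minimal Python sketch (illustrative; the "true" values β₀ = 2, β₁ = 0.5, σ = 1 are invented for the demo) simulating data that satisfies these error assumptions:

import numpy as np

rng = np.random.default_rng(0)

n = 100
beta0, beta1, sigma = 2.0, 0.5, 1.0   # hypothetical true parameters
x = np.linspace(0, 10, n)

# e_i ~ NID(0, sigma^2): independent normal draws, mean 0, common variance
e = rng.normal(loc=0.0, scale=sigma, size=n)

y = beta0 + beta1 * x + e             # observed responses
# beta0 + beta1 * x alone is the true regression line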

Sample Correlation Coefficient r_xy

r_xy = SXY / sqrt((SXX)(SYY)) = [SXY/(n−1)] / sqrt([SXX/(n−1)][SYY/(n−1)])
Correlation Coefficient
r_xy is the sample covariance scaled to lie in [−1, 1], i.e. −1 ≤ r_xy ≤ 1.
r_xy > 0
Positive association
r_xy < 0
Negative association
r_xy = 1
All points lie on a line with positive slope. The closer r_xy is to 1, the closer all points are to lying on such a line.
r_xy = −1
All points lie on a line with negative slope. The closer r_xy is to −1, the closer all points are to lying on such a line.
Bivariate Regression
r and/or its square r² is used to measure how well the linear model fits the data.
Multiple Regression
The multiple correlation coefficient (R²) is used to measure how well the linear model fits the data.
x̄ (x-bar)
Indicates the sample mean of x.
SXY
The corrected sum of cross-products of X and Y: SXY = Σ(xᵢ − x̄)(yᵢ − ȳ).
Linearity
Linearity cannot be deduced from the correlation coefficient alone. It should be paired with a scatter plot and never considered in isolation.
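A minimal Python sketch (illustrative; same hypothetical data as above) computing r_xy from SXY, SXX, and SYY:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

SXX = np.sum((x - x.mean()) ** 2)
SYY = np.sum((y - y.mean()) ** 2)
SXY = np.sum((x - x.mean()) * (y - y.mean()))

r_xy = SXY / np.sqrt(SXX * SYY)           # always lies in [-1, 1]
print(r_xy)
print(np.corrcoef(x, y)[0, 1])            # numpy's built-in agrees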

The χ² Distribution (Chi-Squared)

Degrees of Freedom (df)
The number of independent values/quantities that are free to vary.
χ²(v)
A chi-squared distribution with v df.
E(χ²(v)) = v
i.e. the expected value of a χ² distribution with v df is v.
RSS/σ² ~ χ²(n−2)
So...
E(RSS/σ²) = E(χ²(n−2)) = n−2, and so E(RSS/(n−2)) = σ²
RSS/(n−2)
An unbiased estimate of σ².
sqrt(σ²) = σ
Taking the square root of the estimated variance gives the estimate of the Standard Error of the Regression, a.k.a. the Residual Standard Error (in R).
sqrt(estimated variance) = standard error
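A minimal Python sketch (illustrative; same hypothetical data as before) estimating σ² by RSS/(n−2) and taking its square root to get the residual standard error:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# Fit by least squares (estimators derived earlier)
SXX = np.sum((x - x.mean()) ** 2)
SXY = np.sum((x - x.mean()) * (y - y.mean()))
b1 = SXY / SXX
b0 = y.mean() - b1 * x.mean()

RSS = np.sum((y - (b0 + b1 * x)) ** 2)
sigma2_hat = RSS / (n - 2)                # unbiased estimate of sigma^2
sigma_hat = np.sqrt(sigma2_hat)           # residual standard error
print(sigma2_hat, sigma_hat)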