Regression Cheat Sheet

What is Regression Analysis?

A predictive modelling technique that investigates the relationship between a dependent variable (target) and one or more independent variables (predictors).

1. Shows significant relationships between the target and the predictors

2. Shows the strength of impact of multiple predictors on a target

Coefficients

Mean change in the response variable for one unit of change in the predictor variable while holding other predictors in the model constant.

Think of them as slopes
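A minimal sketch of the "slope" reading, assuming an already fitted linear model. The predictor names, the intercept and the coefficient values below are hypothetical, chosen only for illustration.

```python
# Hypothetical fitted model: price = 50 + 3*size_m2 + 10*rooms
intercept = 50.0
coefficients = {"size_m2": 3.0, "rooms": 10.0}

def predict(row):
    """Linear prediction: intercept plus coefficient * value for each predictor."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)

base = {"size_m2": 80.0, "rooms": 3.0}
bumped = dict(base, size_m2=81.0)  # one more unit of size_m2, rooms held constant

# The change in the prediction equals the coefficient of the predictor we moved.
change = predict(bumped) - predict(base)
print(change, coefficients["size_m2"])
```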

Polynomial Regression (y=a+b*x^2...)

Fits a curve, rather than a straight line, to the data points

Higher degree polynomial -> over-fitting risk
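One possible way to fit such a curve is sketched below, using only the standard library: powers of x are treated as extra predictors and the least-squares normal equations are solved directly. The helper names `solve` and `polyfit` and the data are illustrative assumptions, not part of the cheat sheet.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ..., c_degree] of c0 + c1*x + c2*x^2 + ..."""
    k = degree + 1
    # Normal equations (X^T X) c = X^T y, where X[i][j] = xs[i] ** j.
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data generated from y = 1 + 0.5 * x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 + 0.5 * x * x for x in xs]
coeffs = polyfit(xs, ys, degree=2)
print([round(c, 6) for c in coeffs])
```

With five points, a degree-4 polynomial would pass through every point exactly, noise included, which is the over-fitting risk noted above.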


Multicollinearity

Multicollinearity test: Variance inflation factors (VIF >= 5)

Increases the variance of the coefficient estimates (makes them sensitive to small changes in the model)

Stepwise regression does not work as well when predictors are collinear

Doesn’t affect the overall fit of the model

Doesn't produce bad predictions

SOLUTIONS

- Standardizing predictors

- Removing highly correlated predictors

- Linearly combining predictors (e.g. summing them)

- Different analyses: PLS or PCA
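The VIF test mentioned above can be sketched for the simplest case of exactly two predictors. The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the others; with only two predictors, R² is the squared Pearson correlation between them. The function names and data are illustrative assumptions.

```python
def squared_correlation(a, b):
    """Squared Pearson correlation between two equally long lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return (sab * sab) / (saa * sbb)

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); valid in this form only when there are two predictors."""
    return 1.0 / (1.0 - squared_correlation(x1, x2))

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_moderate = [2.0, 1.0, 4.0, 3.0, 5.0]   # correlated with x1, but not strongly
x2_collinear = [1.1, 1.9, 3.2, 3.9, 5.1]  # almost a copy of x1

vif_moderate = vif_two_predictors(x1, x2_moderate)
vif_collinear = vif_two_predictors(x1, x2_collinear)
print(vif_moderate, vif_collinear)
```

With more than two predictors, R² has to come from a full multiple regression of each predictor on all the others.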

Stepwise Regression

Maximize prediction power with minimum number of predictor variables

Fits the regression model by adding/dropping co-variates one at a time

- Standard stepwise regression adds and removes predictors as needed for each step.

- Forward selection starts with the most significant predictor and adds one variable at each step.

- Backward elimination starts with all predictors and removes the least significant variable at each step.
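Forward selection could be sketched roughly as follows. As a simplifying assumption, "most significant" is approximated here by the largest drop in the residual sum of squares (RSS), and the loop stops when the drop falls below a fraction `min_gain` of the total sum of squares. Real stepwise procedures typically use p-values, F-tests, or information criteria instead. All names and data are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus `columns`."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))

def forward_selection(predictors, y, min_gain=0.05):
    """Greedily add the predictor that lowers RSS most; stop when the gain is small."""
    total = rss([], y)  # intercept-only model: total sum of squares
    selected, current = [], total
    while len(selected) < len(predictors):
        scores = {name: rss([predictors[s] for s in selected] + [col], y)
                  for name, col in predictors.items() if name not in selected}
        best = min(scores, key=scores.get)
        if current - scores[best] < min_gain * total:
            break
        selected.append(best)
        current = scores[best]
    return selected

# Made-up data: y depends on x1 and x2 (plus small noise); x3 is irrelevant.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
y = [5.1, 1.2, 8.9, 4.8, 13.2, 9.1, 16.8, 12.9]
selected = forward_selection(predictors, y)
print(selected)
```

Backward elimination would run the same loop in reverse: start from all predictors and drop the one whose removal raises RSS the least.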

Regularized Linear Models (Shrinkage)

Regularizes a linear model by constraining its weights

A regularization term is added to the cost function, so the learning algorithm not only fits the data but also keeps the model weights as small as possible.

Ridge (L2) | Lasso (L1) | ElasticNet (L1 & L2)

Ridge Regression

L2: adds a penalty equivalent to the square of the magnitude of the coefficients

Minimization = LS Obj + α * (sum of squ of coefficients)

It shrinks the values of the coefficients, but they do not reach zero
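The shrinking behaviour can be seen in the simplest possible case, sketched here under the assumption of a single predictor and no intercept. Minimizing sum((y - b*x)²) + α * b² then has the closed form b = sum(x*y) / (sum(x²) + α). The function name and data are illustrative.

```python
def ridge_slope(xs, ys, alpha):
    """Slope b minimizing sum((y - b*x)^2) + alpha * b^2 (one predictor, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

# Made-up data following y = 2*x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Larger alpha pulls the slope toward zero, but it never becomes exactly zero.
for alpha in (0.0, 1.0, 10.0, 100.0):
    print(alpha, ridge_slope(xs, ys, alpha))
```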

Lasso Regression

L1: adds penalty equivalent to abs. value of the magnitude of coefficients

Minimization = LS Obj + α * (sum of abs value of coefficients)

LS Obj - Least Squares objective
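A sketch of the Lasso objective in the one-predictor, no-intercept case (an assumption made to keep the algebra short). Minimizing sum((y - b*x)²) + α * |b| gives a soft-thresholded slope: sum(x*y) is pulled toward zero by α/2 and clipped at zero. Unlike a squared penalty, this can set the coefficient to exactly zero. The function name and data are illustrative.

```python
def lasso_slope(xs, ys, alpha):
    """Slope b minimizing sum((y - b*x)^2) + alpha * |b| (one predictor, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Soft-thresholding: shrink |sxy| by alpha/2, never past zero.
    shrunk = max(abs(sxy) - alpha / 2.0, 0.0)
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * shrunk / sxx

# Made-up data following y = 2*x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for alpha in (0.0, 10.0, 56.0, 100.0):
    print(alpha, lasso_slope(xs, ys, alpha))
```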

ElasticNet Regression

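Combines Ridge and Lasso: 'r' controls the mix ratio (r = 1 is pure Lasso, r = 0 is pure Ridge, up to the scaling of α).

Minimization = LS Obj + r * α * (sum of abs value of coefficients) + ((1 - r) / 2) * α * (sum of squ of coefficients)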
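A small sketch of the mixed penalty term, assuming the common convention in which the mix ratio r weights the L1 part and (1 - r) / 2 weights the L2 part. Other texts and libraries scale the two parts differently, so treat the exact weights as an assumption. The function name and coefficient values are hypothetical.

```python
def elastic_net_penalty(coefs, alpha, r):
    """Penalty r*alpha*sum(|b|) + (1 - r)/2 * alpha * sum(b^2) for mix ratio r in [0, 1]."""
    l1 = sum(abs(b) for b in coefs)
    l2 = sum(b * b for b in coefs)
    return r * alpha * l1 + (1.0 - r) / 2.0 * alpha * l2

coefs = [3.0, -4.0]  # hypothetical coefficient vector
alpha = 2.0

print(elastic_net_penalty(coefs, alpha, r=1.0))  # L1 part only (Lasso-style)
print(elastic_net_penalty(coefs, alpha, r=0.0))  # L2 part only (Ridge-style)
print(elastic_net_penalty(coefs, alpha, r=0.5))  # an even mix of both
```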

Regression Types

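Techniques are mostly driven by three metrics: the number of independent variables, the type of the dependent variable, and the shape of the regression line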

Linear Regression (Y=a+b*X + e)

Straight line (regression line)

The least squares method finds the best-fit line

Linear relationship between predictors and target

CONS

Affected by multicollinearity, autocorrelation, and heteroskedasticity

Very sensitive to Outliers
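The sensitivity to outliers can be illustrated with a minimal least-squares fit on made-up data. The function name `fit_line` and the numbers are assumptions for this sketch only.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_clean = [2.0, 4.0, 6.0, 8.0, 10.0]     # exactly y = 2*x
ys_outlier = [2.0, 4.0, 6.0, 8.0, 30.0]   # same data, last point replaced by an outlier

clean_intercept, clean_slope = fit_line(xs, ys_clean)
out_intercept, out_slope = fit_line(xs, ys_outlier)

# A single outlier changes the fitted slope and intercept substantially.
print(clean_intercept, clean_slope)
print(out_intercept, out_slope)
```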

Logistic Regression

Target is binary (0/1): binomial distribution

Logit function

Widely used for classification problems

Can handle various types of relationships because it applies a non-linear log transformation to the predicted odds

Coefficients are estimated by maximum likelihood

Requires large sample sizes

Ordinal Target -> Ordinal logistic regression

Multiclass Target -> Multinomial Logistic regression.
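Maximum likelihood estimation for the binary case can be sketched with plain gradient ascent on the log-likelihood, assuming one predictor. Statistical packages normally use faster solvers such as Newton-type methods. The learning rate, step count and data below are arbitrary illustrative choices.

```python
import math

def sigmoid(z):
    """Map a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Estimate (b0, b1) of p = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed label minus predicted probability
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Made-up binary data: larger x makes the event (1) more likely, with some overlap.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
p_low = sigmoid(b0 + b1 * 0.5)
p_high = sigmoid(b0 + b1 * 5.0)
print(b0, b1, p_low, p_high)
```

The classes overlap on purpose: with perfectly separable data the maximum likelihood coefficients grow without bound.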

Logistic Regression

odds = p / (1 - p)   # event probability / non-event probability
ln(odds) = ln(p / (1 - p))
logit(p) = ln(p / (1 - p)) = b0 + b1X1 + b2X2 + b3X3 + ... + bkXk

p is the probability of presence of the characteristic of interest.
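The relations above translate directly into code. This sketch uses only the standard library; the function names are illustrative, and `inverse_logit` is the usual sigmoid that maps a linear score back to a probability.

```python
import math

def odds(p):
    """Event probability divided by non-event probability."""
    return p / (1.0 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

def inverse_logit(z):
    """Probability p such that logit(p) = z."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8
print(odds(p))                   # about 4: the event is four times as likely as not
print(logit(0.5))                # even odds correspond to a logit of zero
print(inverse_logit(logit(p)))   # the round trip recovers p up to rounding
```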

Regression Cheat Sheet (DRAFT) by isantabarbara
