Multiple Variable Linear Regression using Scikit Cheat Sheet (DRAFT)

You can implement multiple linear regression following the same steps as you would for simple regression.

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Step 1: Import packages and classes

# Import pandas and numpy, 
# then LinearRegression available inside linear_model of sklearn and 
# pyplot from matplotlib to visualize

import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

Step 2: Import and prepare data

df = pd.read_csv("sid's File.csv") 
df.head()  # show the first five rows of the dataset

Check for missing values. For numerical columns, fill them with the mean or median of the column. For text columns, fill them with the most common value, or with a value inferred from another feature column.
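
The filling strategy above can be sketched roughly as follows. This is only an illustration on a small made-up DataFrame: the column names `area`, `bedrooms`, and `town` are hypothetical and do not come from the CSV file used in this cheat sheet.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the DataFrame loaded from the CSV
df = pd.DataFrame({
    "area": [2600.0, 3000.0, np.nan, 3600.0],
    "bedrooms": [3.0, np.nan, 4.0, 5.0],
    "town": ["A", "B", None, "B"],
})

# Count the missing values in each column
missing_counts = df.isnull().sum()

# Numerical columns: fill with the mean or the median of the column
df["area"] = df["area"].fillna(df["area"].mean())
df["bedrooms"] = df["bedrooms"].fillna(df["bedrooms"].median())

# Text column: fill with the most common value (the mode)
df["town"] = df["town"].fillna(df["town"].mode()[0])
```

Whether the mean or the median is the better choice depends on the data; the median is usually less sensitive to outliers.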

Step 3: Visualize

# Only needed in a Jupyter/IPython notebook; keep the magic on its own line
%matplotlib inline

# Replace 'x_column' and 'y_column' with the names of the columns to plot
plt.xlabel('x_column')
plt.ylabel('y_column')
plt.scatter(df['x_column'], df['y_column'], color='red', marker='+')

# This step is optional; use it to see the relationship between features more clearly.
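
With several features, one way to apply this step is to draw one scatter plot per feature against the target. The sketch below assumes hypothetical feature columns `area`, `bedrooms`, and `age` and a hypothetical target column `price`, with made-up values; substitute the columns of your own dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data; the column names and values are made up for illustration
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000],
    "bedrooms": [3, 4, 3, 3, 5],
    "age": [20, 15, 18, 30, 8],
    "price": [550000, 565000, 610000, 595000, 760000],
})

features = ["area", "bedrooms", "age"]
target = "price"

# One subplot per feature, all sharing the same y axis (the target)
fig, axes = plt.subplots(1, len(features), figsize=(12, 3), sharey=True)
for ax, feature in zip(axes, features):
    ax.scatter(df[feature], df[target], color="red", marker="+")
    ax.set_xlabel(feature)
axes[0].set_ylabel(target)
fig.tight_layout()
```

In a notebook the figure is displayed automatically; when running this as a script, call `plt.show()` at the end.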