TO START
# IMPORT DATA LIBRARIES
import numpy as np
import pandas as pd
# IMPORT VIS LIBRARIES
import seaborn as sns
import matplotlib.pyplot as plt
# Jupyter-only magic: displays plots inline in the notebook
%matplotlib inline
# IMPORT MODELLING LIBRARIES
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix
PRELIMINARY OPERATIONS
df = pd.read_csv('data.csv')
read the data into a DataFrame
STANDARDISE THE VARIABLES
scaler = StandardScaler()
scaler.fit(df.drop('y', axis=1))
scaled_feat = scaler.transform(df.drop('y', axis=1))
df_new = pd.DataFrame(scaled_feat, columns=df.columns[:-1])*
df.columns[:-1]: means take all the columns except the last one (here, everything but the target 'y').
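A minimal sketch of the same scaling done in one step with fit_transform (the tiny DataFrame below is hypothetical, with 'y' as the last column):
# hypothetical data: two features plus a target 'y' as the last column
df = pd.DataFrame({'col1': [1.0, 2.0, 3.0],
                   'col2': [10.0, 20.0, 30.0],
                   'y': [0, 1, 0]})
scaler = StandardScaler()
# fit_transform() combines fit() and transform() in a single call
scaled_feat = scaler.fit_transform(df.drop('y', axis=1))
df_new = pd.DataFrame(scaled_feat, columns=df.columns[:-1])
print(df_new)  # each feature now has mean 0 and unit variance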
TRAIN MODEL
CREATE X and y
X = df[['col1','col2',etc.]]
create the DataFrame of features
y = df['col']
create the variable to predict (target)
SPLIT DATASET
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
split the data into train and test sets
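If you need a reproducible split, you can fix the seed; a minimal variant (the random_state value of 101 is arbitrary):
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101)  # same split on every run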
FIT THE MODEL
knn = KNeighborsClassifier(n_neighbors=1)*
knn.fit(X_train, y_train)
train (fit) the model
MAKE PREDICTIONS
pred = knn.predict(X_test)
make predictions
n_neighbors=1: we start with K = 1 and then see how to choose a better K value (see the evaluation block in this cheat sheet).
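As an alternative to the manual loop in the evaluation block, scikit-learn's GridSearchCV can pick K by cross-validation; a sketch (cv=5 and the 1 to 39 range are assumptions mirroring that loop):
from sklearn.model_selection import GridSearchCV
# try every K from 1 to 39 with 5-fold cross-validation on the training set
grid = GridSearchCV(KNeighborsClassifier(),
                    {'n_neighbors': list(range(1, 40))},
                    cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_)  # the best K found, e.g. {'n_neighbors': 7}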
EVALUATION of the MODEL
EVALUATE MODEL
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))
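For a binary problem, the confusion matrix can also be unpacked into its four cells; a sketch (assumes exactly two classes):
# ravel() flattens the 2x2 matrix into (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)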
CHOOSING A BETTER K
error_rate = []*
create an empty list
for i in range(1,40):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    pred_i = knn.predict(X_test)
    error_rate.append(np.mean(pred_i != y_test))
ELBOW PLOT
plt.figure(figsize=(10,6))
plt.plot(range(1,40), error_rate)
plt.title('Error Rate vs. K Value')
plt.xlabel('K')
plt.ylabel('Error Rate')
Now we choose the K value where the error rate starts to drop and flatten out, and we repeat the model fitting and evaluation (a sketch follows the explanation below). In theory, you should obtain better results.
Explanation:
1. we create an empty list.
2. we loop over a range of possible K values, here 1 to 39.
3. we create and fit a KNN model for each of these K values.
4. we make predictions using each of these models.
5. we calculate the mean error rate of each model and store it in the empty list from point 1. We then plot these error rates to see which K value could be the best.
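A sketch of the refit with the chosen K (K=23 here is purely illustrative; read yours off the elbow plot):
knn = KNeighborsClassifier(n_neighbors=23)  # hypothetical K from the plot
knn.fit(X_train, y_train)
pred = knn.predict(X_test)
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))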