
pandas Cheat Sheet (DRAFT)


This is a draft cheat sheet. It is a work in progress and is not finished yet.

Import and Export Data: CSV and Excel

df = pd.read_csv('filename.csv')
Read CSV into a Pandas DataFrame
df.to_csv('filename.csv')
Export Pandas DataFrame to CSV
df = pd.read_excel('filename.xlsx', sheet_name="Sheet 1")
Read Excel into a Pandas DataFrame
df.to_excel('filename.xlsx', sheet_name="Sheet 1")
Export Pandas DataFrame to Excel
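Import and Export Options:

header=None (reading: the file has no header row), usecols=[5, 6] (reading: load only these columns), index=False (writing: leave the row index out of the file)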

pandas can also read HTML and JSON, with pd.read_html and pd.read_json.
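
The read and write calls above can be tried without touching the disk. This is a minimal sketch that assumes made-up sample data and uses in-memory text buffers in place of 'filename.csv'; the column names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of 'filename.csv'.
csv_text = "name,age\nTom,42\nAnn,35\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# to_csv is a DataFrame method, not a pd function.
# index=False keeps the row index out of the output.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
round_trip = buffer.getvalue()

# header=None says the data has no header row;
# usecols selects columns by position (here the first and third).
no_header = pd.read_csv(io.StringIO("1,2,3\n4,5,6\n"), header=None, usecols=[0, 2])
```

Swapping the buffers for real paths such as 'filename.csv' should behave the same way.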

Initial look into the DataFrame

df.head(5)
Returns the first 5 rows
df.tail(5)
Returns the last 5 rows
df.shape
Gives the number of rows and columns in the DataFrame as a (rows, columns) tuple; it is an attribute, so no parentheses
df.dtypes
Gives the datatypes for all the columns
df['ColumnName'].dtype
Gives the datatype of a single column
df.head(-5) returns every row except the last 5, much like list[:-5]. Use df.tail(5) to get the last 5 rows.
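
A short sketch of these inspection calls on a small, made-up DataFrame (the column names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical 8-row frame.
df = pd.DataFrame({"Column1": range(8), "Column2": list("abcdefgh")})

first_five = df.head(5)          # rows 0..4
last_five = df.tail(5)           # rows 3..7
all_but_last_five = df.head(-5)  # rows 0..2: everything except the last 5

# shape is an attribute holding (rows, columns).
n_rows, n_cols = df.shape

# dtypes lists every column's type; dtype gives a single column's type.
all_types = df.dtypes
one_type = df["Column1"].dtype
```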

Change Column Data Type

df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
Remove % sign, convert to float and divide by 100
df['col'] = df['col'].str[:-1].astype('float') / 100.0
Drops the last character unconditionally, whatever it is, then converts to float and divides by 100
df['Column1'] = df['Column1'].astype(float)
Change 'Column1' to float
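
The difference between the two percent conversions shows up when a value is missing its % sign. A minimal sketch, assuming a hypothetical column of percentage strings:

```python
import pandas as pd

# Hypothetical data: the last value has no trailing '%'.
raw = pd.Series(["12%", "7.5%", "15"])

# rstrip('%') only removes a trailing '%', so "15" stays "15".
stripped = raw.str.rstrip("%").astype("float") / 100.0

# str[:-1] always drops the last character, so "15" becomes "1".
sliced = raw.str[:-1].astype("float") / 100.0
```

Here stripped ends with 0.15 while sliced ends with 0.01, so the rstrip version is the safer choice when the input may be inconsistent.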
 

Re-Order Columns

df = df[['Column3', 'Column2', 'Column1']]
Re-orders the columns to the order specified in this list

Dealing with NaN values

df = df.ffill()
Fills missing values using the forward fill method (replaces the deprecated fillna(method='ffill'))
df = df.bfill()
Fills missing values using the backward fill method (replaces the deprecated fillna(method='bfill'))
df.dropna(inplace=True)
Removes every row that contains at least one missing value
Forward filling replaces each missing value with the last non-missing value before it in the column. Backward filling replaces it with the next non-missing value after it.
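
A small sketch of forward fill, backward fill, and dropping rows, using a made-up single-column frame with a gap in the middle:

```python
import pandas as pd

nan = float("nan")

# Hypothetical column with two missing values between 1.0 and 4.0.
df = pd.DataFrame({"Column1": [1.0, nan, nan, 4.0]})

forward = df.ffill()   # gaps take the previous value: 1, 1, 1, 4
backward = df.bfill()  # gaps take the next value:     1, 4, 4, 4
dropped = df.dropna()  # rows with a missing value are removed
```

A leading gap has nothing to forward fill from, and a trailing gap has nothing to backward fill from, so those cells stay missing.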

Padding Values With Zeros

df['Column1'] = df['Column1'].astype(str).str.zfill(6)
Pads each value with leading zeros to a width of six (6) characters
The column has to be converted to strings first, because a number cannot keep leading zeros such as 0009.
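
A minimal sketch of the padding step on made-up integers, showing that values already longer than six characters are left as they are:

```python
import pandas as pd

# Hypothetical numeric IDs of different lengths.
df = pd.DataFrame({"Column1": [9, 123, 1234567]})

# Convert to str first, then left-pad with zeros to width 6.
df["Column1"] = df["Column1"].astype(str).str.zfill(6)

padded = df["Column1"].tolist()
```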

Filter Column

df2 = df[df['Column1'] >= some_number]
Filters the DataFrame on one condition and stores the matching rows in another DataFrame.
df2 = df[(df["Name"] == "Tom") & (df["Age"] == 42)]
Filters the DataFrame on multiple conditions and stores the matching rows in another DataFrame. Each condition needs its own parentheses.
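
Both filters can be sketched on a small, made-up frame. The data and the value of some_number are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"Name": ["Tom", "Tom", "Ann"], "Age": [42, 30, 42]})

some_number = 40  # hypothetical threshold

# Single condition: keep rows where Age is at least some_number.
older = df[df["Age"] >= some_number]

# Multiple conditions: combine boolean masks with & (and) or | (or).
# The parentheses are required because & binds tighter than ==.
tom_42 = df[(df["Name"] == "Tom") & (df["Age"] == 42)]
```

The filtered frames keep the original row labels; calling reset_index(drop=True) on the result renumbers them from zero if that is preferred.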