Show Menu

PYTHON PANDAS Cheat Sheet by

install and import

installing pandas
pip install pandas
pip install pandas
import pandas as pd

Reading and describing

pd -> pandas
df-> dataframe
to read a file into a dataframe
df= pd.rea­d_c­sv(­'fi­len­ame')
look at the first 5 lines
to describe df
to print all the column names
to get the dimension of df

Sorting and filtering

sorting can be done column wise - default is ascending
df.sor­t_v­alu­es(­by=­'Total day charge')
Sort values by col1 in ascending order (use ascending =False for descending sort)
Sort values by col1 in ascending order then col2 in descending order
df[con­dition]  ­ ­  #eg: df[df[­'co­l']­>5]
df[df[­'col'] > 0.5]
Rows where the column col is greater than 0.5
df[(df­[col] > 0.5) & (df[col] < 0.7)]
Rows where 0.7 > col > 0.5


df.mer­ge(df2) gives you a copy of df merged with df2. you may save it to a new variable. ex­rge­(df2)
if you want to merge df2 to df right away use inplace. df.mer­ge(­df2­,in­pla­ce=­True)

Rows and columns

to delete a row - [axis=0 means rows]
new_df = df.dro­p([­2,3­],axis = 0) #this drops the row with index 2,3
to delete a column- [axis=1 means columns]
new_df = df.dro­p([­'co­l1'­,'c­ol2­'],axis = 0) #this drops the column with name col1 and col2

Df manipu­lation

create or edit a new column
df['ne­w_c­oln­ame'] = 5   #this creates a new new column with all values as 5
create a new column
df['ne­w_c­oln­ame'] = [list of values]   #this creates a new column with list of values assigned to each corres­ponding row
NOTE : df['ne­w_c­oln­ame'] = [list of values] throws an error if the no of items in [list of values] doesn't match no of rows
create or edit a new row
df.loc­[in­dex­_of­_row] = [list of items]
NOTE : df.loc­[in­dex­_of­_row] = [list of items] throws an error if the no of items doesn't match no of rows


Returns column with label col as Series
df[[col1, col2]]
Returns multiple columns as a new DataFrame
 ­  Country  ­ ­Capital  ­ ­ ­ ­ ­ ­Pop­ulation
1 Belgiu­m   Brussels  ­ ­ ­ ­111­90846
2 India  ­ ­ ­ ­ ­ ­ New Delhi  ­130­3171035
3 Brazil  ­ ­ ­ ­ ­ ­Bra­silia  ­ ­ ­ ­ ­ ­207­847528
df.ilo­c([0], [0]) --> 'Belgium'  | s.iloc[0] | Selection by position (0th position on row and column)
df.loc­([0], ['Coun­try']) --> 'Belgium'
df.ix[2] -->
Country  ­ ­ ­ ­ ­  Brazil
Capital         Brasilia
Population   207847528
df.ix[1, 'Capital'] --> 'New Delhi'
df.ilo­c[0,:] | select First row

Data Cleaning

Change the index with a new column
df.columns = ['new_­col­_na­me1­','­new­_co­l_n­ame­2',­'ne­w_c­ol_­name3']
Rename columns
Checks for null Values, Returns Boolean Arrray
Opposite of pd.isn­ull()
Drop all rows that contain null values
Drop all columns that contain null values
Drop all rows have have less than n non null
Replace all null values with x


Adds the rows in df1 to the end of
df2 (columns should
be identical)
pd.con­cat­([df1, df2],a­xis=1)
Adds the columns in
df1 to the end
of df2 (rows should be
joins the columns in df1 with the columns on
df2 where the rows
for col have identical
values. how can be
one of 'left',
'right', 'outer', 'inner'
left = takes the index of left df
right =takes the index of left
outer = union of both keys
inner = inters­ection of both keys


No comments yet. Add yours below!

Add a Comment

Your Comment

Please enter your name.

    Please enter your email address

      Please enter your Comment.

          Related Cheat Sheets

          PandasPlotting Cheat Sheet