Pandas FAQ Cheat Sheet by CodingJinxx

Frequently Asked Questions for Pandas

List Comprehension

List comprehension offers a shorter syntax when you want to create a new list based on the values of an existing list.

Based on a list of fruits, you want a new list containing only the fruits with the letter "a" in the name.

Without list comprehension you have to write a for statement with a conditional test inside:

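fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []

for x in fruits:
    # Keep only the fruits whose name contains the letter "a"
    if "a" in x:
        newlist.append(x)

print(newlist)  # ['apple', 'banana', 'mango']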

With list comprehension you can do all that with only one line of code:

fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = [x for x in fruits if "a" in x]



Imputation

In statistics, imputation is the process of replacing missing data with substituted values.

When an entire data point is substituted, it is known as "unit imputation";
when a single component of a data point is substituted, it is known as "item imputation".
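
As a minimal sketch of item imputation in pandas, the example below fills a missing value with the mean of the observed values using `fillna`. The column names and numbers are made up for illustration, and mean imputation is only one of several possible strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one age is missing (a single component of a data point)
df = pd.DataFrame({"name": ["tom", "nick", "juli"],
                   "age": [10.0, np.nan, 14.0]})

# Item imputation: replace the missing age with the mean of the observed ages
df["age_imputed"] = df["age"].fillna(df["age"].mean())

print(df)
```

The original column is left untouched so the imputed values can be compared with the raw data.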

Pandas Imputation Article

Aggregate Functions

sum()                                       Sums the values of an object
count()                                     Counts the non-null values of an object
median()                                    Returns the median
quantile([0.25, 0.75])                      Returns the requested quantiles of an object
min()                                       Lowest value in an object
max()                                       Highest value in an object
mean()                                      Returns the mean
var()                                       Returns the variance
std()                                       Returns the standard deviation
groupby('col')                              Groups data by the values of the specified column (similar to SQL GROUP BY)
pd.merge(adf, bdf, how='left', on='col')    Merges two DataFrames into one based on a common column

Aggregate functions are a way of summarizing or reshaping data.
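
The following sketch shows a few of these calls on a small, made-up dataset. The frame names `adf` and `bdf` follow the merge example above; the column names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data for illustration
adf = pd.DataFrame({"col": ["a", "b", "a", "c"],
                    "amount": [10, 20, 30, 40]})
bdf = pd.DataFrame({"col": ["a", "b"],
                    "label": ["alpha", "beta"]})

total = adf["amount"].sum()                    # 100
n = adf["amount"].count()                      # 4 non-null values
med = adf["amount"].median()                   # 25.0
q = adf["amount"].quantile([0.25, 0.75])       # 17.5 and 32.5

# Sum of 'amount' within each group of 'col' (like SQL GROUP BY)
per_group = adf.groupby("col")["amount"].sum()

# Left merge: keeps every row of adf; rows without a match get NaN in 'label'
merged = pd.merge(adf, bdf, how="left", on="col")

print(per_group)
print(merged)
```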

Shape of a DataFrame

Return a tuple representing the dimensionality of the DataFrame.

>>> import pandas as pd
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
>>> df.shape
(2, 2)


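Mean

Return the mean of the values over the requested axis.

DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None)

This signature comes from older pandas 1.x releases; the level parameter was removed in pandas 2.0.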
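
A short sketch of how `axis` and `skipna` change the result, using a made-up frame with one missing value. Only the `axis` and `skipna` parameters are used here, since they behave the same across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in col1
df = pd.DataFrame({"col1": [1.0, 2.0, np.nan],
                   "col2": [3.0, 4.0, 5.0]})

col_means = df.mean()               # default axis=0: one mean per column, NaN skipped
row_means = df.mean(axis=1)         # axis=1: one mean per row
strict = df.mean(skipna=False)      # NaN propagates, so col1 becomes NaN

print(col_means)
print(row_means)
print(strict)
```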


Median

Sorts the values along the requested axis and returns the middle value (the average of the two middle values when the count is even).

DataFrame.median(axis=None, skipna=None, level=None, numeric_only=None)
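
To illustrate the "middle value" idea, here is a small sketch with made-up numbers, covering both an odd and an even number of values.

```python
import pandas as pd

odd = pd.Series([5, 1, 3])        # sorted: 1, 3, 5 -> the middle value is 3
even = pd.Series([4, 1, 3, 2])    # sorted: 1, 2, 3, 4 -> average of 2 and 3

odd_median = odd.median()         # 3.0
even_median = even.median()       # 2.5

# On a DataFrame, median() returns one value per column by default
df = pd.DataFrame({"col1": [1, 2, 9], "col2": [3, 4, 5]})
col_medians = df.median()

print(odd_median, even_median)
print(col_medians)
```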

Creating a DataFrame from Scratch

# Import pandas library
import pandas as pd
# Initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age'])

"From scratch" means creating the data by hand.

Categorical Variable

A categorical variable holds data that is limited to a fixed set or range of values.

Categorical variables are best visualised using bar plots or balloon plots.
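
As a rough sketch, pandas can store such data with the `category` dtype, and `value_counts` yields the per-category counts that a bar plot would display. The size labels below are made up for illustration.

```python
import pandas as pd

# Hypothetical data limited to three possible values
sizes = pd.Series(["small", "large", "medium", "small", "small"])

# Convert to the categorical dtype
cat = sizes.astype("category")
categories = list(cat.cat.categories)

# Count how often each category occurs
counts = cat.value_counts()

print(categories)
print(counts)
# With matplotlib installed, counts.plot.bar() would draw these counts as a bar plot
```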

Example Article

Quartiles vs Quantiles

Quartiles are the 25th, 50th and 75th percentiles of the data.

Quantiles, by contrast, can be taken at any custom percentile.
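
A minimal sketch of the difference, using `Series.quantile` on the integers 0 to 100 (chosen here only so that each percentile lines up with a value in the data).

```python
import pandas as pd

# Values 0..100, so the p-th percentile is approximately p
s = pd.Series(range(101))

# Quartiles: the 25th, 50th and 75th percentiles
quartiles = s.quantile([0.25, 0.5, 0.75])

# Quantiles at custom cut points, here the 10th and 90th percentiles
custom = s.quantile([0.1, 0.9])

print(quartiles)
print(custom)
```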


Correlation

Correlation describes the relationship between two variables.

For example, if the square footage of an apartment increases, the price of the apartment tends to increase as well.
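
The apartment example can be sketched with `Series.corr`, which computes the Pearson correlation coefficient by default. The square footage and price figures are invented for illustration.

```python
import pandas as pd

# Hypothetical apartments: price rises with square footage
df = pd.DataFrame({"sqft": [400, 600, 800, 1000, 1200],
                   "price": [100, 140, 190, 230, 290]})

# Pearson correlation between the two columns; values near 1 mean they rise together
r = df["sqft"].corr(df["price"])

# Pairwise correlation matrix for all numeric columns
matrix = df.corr()

print(round(r, 3))
print(matrix)
```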


Scatter Plot

A scatter plot draws pairs of values as points on an x-y grid.


Histogram

A histogram plots data along an axis, with the count of values in each bin represented by the height of its bar.
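
The counts behind a histogram can be computed without plotting. This sketch uses `numpy.histogram` on made-up ages with hand-picked bin edges, then prints a simple text version of the bars.

```python
import numpy as np

# Hypothetical ages and bin edges chosen for illustration
ages = [10, 15, 14, 21, 22, 25, 31, 38]
counts, edges = np.histogram(ages, bins=[10, 20, 30, 40])

# Each bar's length stands in for the bar height in a plotted histogram
for left, right, c in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {'#' * c}")
```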

