Cheatography

Python-Dev Cheat Sheet (DRAFT) by

A Python cheat sheet for data analysis, automation and web development

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Schedule module

schedule.every(10).seconds.do(job)
Every N seconds
schedule.every(5).minutes.do(job)
Every N minutes
schedule.every().hour.do(job)
Every hour
schedule.every().day.at("10:30").do(job)
Daily at a specific time
schedule.every().monday.do(job)
Weekly, on Monday
schedule.every().friday.at("18:00").do(job)
Weekly at a specific time
schedule.every(3).hours.do(job)
Custom interval (every 3 hours)
schedule.run_pending()
Run due jobs
schedule.cancel_job(job_instance)
Cancel a job
schedule.every().day.at("12:00").do(greet, name="Kush")
Schedule tasks with arguments

Types of errors

NameError
Doesn't recognize the name you are using
TypeError
When you try to combine or manipulate data in a way python doesn't allow
IndexError
The index doesn't exist
KeyError
When you try to access a value in a dictionary using a key that doesn't exist
ZeroDivisionError
When you divide a number by 0
ValueError
Function receives an argument of the correct type but with an invalid value
AttributeError
Invalid attribute or method for an object
ImportError / ModuleNotFoundError
Failed to import a module
FileNotFoundError
File does not exist when trying to open it
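
As a minimal sketch of how these exceptions surface in practice, the snippet below triggers each one on purpose and records the name of the exception class. The helper `error_name`, the labels and the sample inputs are illustrative assumptions, not part of the original sheet.

```python
def error_name(func):
    """Call func() and return the name of the exception it raises, or None."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None

# Each lambda deliberately performs one invalid operation.
results = {
    "undefined name": error_name(lambda: undefined_variable),
    "str + int": error_name(lambda: "a" + 1),
    "list index": error_name(lambda: [1, 2, 3][10]),
    "dict key": error_name(lambda: {"a": 1}["b"]),
    "divide by zero": error_name(lambda: 1 / 0),
    "int('abc')": error_name(lambda: int("abc")),
    "missing attribute": error_name(lambda: "text".no_such_method()),
    "missing module": error_name(lambda: __import__("no_such_module_xyz")),
    "missing file": error_name(lambda: open("no_such_file_xyz.txt")),
}

for label, name in results.items():
    print(f"{label}: {name}")
```

Catching the specific class (for example `except KeyError:`) rather than a bare `except Exception:` is usually the better choice in real code; the broad catch here only serves to collect the names.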

Pandas module

df = pd.DataFrame(dictionary)
Convert a dictionary into a pandas DataFrame
df = pd.read_csv('file.csv')
Read a CSV file into a DataFrame
df = pd.read_excel('file.xlsx')
Read an Excel file into a DataFrame
df = pd.read_json('file.json')
Read a JSON file into a DataFrame
df.to_csv('output.csv', index=False)
Write a DataFrame to a CSV file
df.to_excel('output.xlsx')
Write a DataFrame to an Excel file
df.head(k)
First k rows, leave empty for five
df.tail(k)
Last k rows, leave empty for five
df.info()
Data types and non-null values
df.describe()
Summary statistics
df.shape
No. of rows and columns
df.columns
Column names
df.dtypes
Data types
df['col']
A specified column
df.iloc[k, l]
A specified cell by integer position, leave l empty for an entire row
df.loc[k, 'col']
A specified cell by index label, 'col' is the column name
df[0:5]
Slicing rows
df[df['col'] > 25]
Filter data by condition
df[(df['col'] > 25) & (df['Age'] < 40)]
Filter data by multiple conditions (wrap each condition in parentheses, because & binds tighter than >)
df[df['Name'].isin(['Alice'])]
Filter by values
df.rename(columns={'old': 'new'})
Renaming a column
df.drop(columns=['Col1', 'Col2'])
Dropping columns
df.drop(index=[0, 1])
Dropping rows
df[col].sum()
Sum of values in col
df[col].mean()
Mean of values in col
df[col].value_counts()
Count of each unique value in col
df.groupby(col).mean()
Grouped stats
df.isnull()
Boolean DataFrame that is True where a value is null
df.isnull().sum()
No. of null values per column
df.dropna()
Drop the rows that contain null values
df.fillna(k)
Fill the missing values with value k
df['col'] = df['col'].str.strip()
Remove whitespace
df['col'] = df['col'].str.lower()
Present data in lowercase
df['col'] = pd.to_datetime(df['col'])
Convert to datetime
df.sort_values('Age')
Sort data by age
df.sort_values(['Age', 'Name'])
Sort data by multiple values
df.reset_index(drop=True)
Reset index
pd.concat([df1, df2])
Appending rows
pd.merge(df1, df2, on='ID')
Joining data by column value
pd.merge(df1, df2, how='left', on='ID')
Left joining data by column value
df.pivot_table(index='Gender', values='Age', aggfunc='mean')
Create a pivot table with mean of the values categorized by index

MIME module

msg = MIMEText('This is a plain text email body', 'plain')
Define message in plain format
msg['Subject'] = 'Plain Text Email'
Define subject
msg['From'] = 'sender@example.com'
Define sender's address
msg['To'] = 'recipient@example.com'
Define recipient's address
msg.attach(content)
Attach a part (content) to a multipart message
msg = MIMEMultipart('alternative')
Create a multipart message that carries plain-text and HTML versions of the same body
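
A minimal sketch of how the calls above fit together, assuming the standard-library `email.mime` classes: it builds one message with a plain-text part and an HTML part. The HTML body and the variable names `raw_message` and `part_types` are illustrative assumptions; nothing is sent.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# 'alternative' tells mail clients that the parts are different renderings
# of the same content; clients typically show the last part they support.
msg = MIMEMultipart("alternative")
msg["Subject"] = "Plain Text Email"
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"

# Attach the plain-text version first, then the HTML version.
msg.attach(MIMEText("This is a plain text email body", "plain"))
msg.attach(MIMEText("<p>This is an <b>HTML</b> email body</p>", "html"))

raw_message = msg.as_string()  # text form, usable as the message for sendmail()
part_types = [part.get_content_type() for part in msg.get_payload()]
print(part_types)
```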

Matplotlib module

plt.plot(x, y, color='red', linestyle='--', marker='o', label='line 1')
Line plot with color red, dashed lines, o marker labeled as 'line 1'
plt.title("Title")
Set title of the chart
plt.xlabel("x-axis")
Label of x-axis
plt.ylabel("y-axis")
Label of y-axis
plt.legend()
Show legend
plt.grid(True)
Show grid
plt.show()
Display the chart
plt.figure(figsize=(6, 4))
Set figure size
plt.subplot(2, 1, 1)
2 rows, 1 column, 1st plot
plt.tight_layout()
Avoid overlap
plt.scatter(x, y)
Scatter plot
plt.bar(x, y)
Bar plot
plt.barh(x, y)
Horizontal bar plot
plt.hist(list, bins=5)
Histogram plot
plt.pie(data_list, labels=label_list, autopct='%1.1f%%')
Pie chart plot
plt.style.use('ggplot')
Set global chart style
plt.style.available
Show all chart styles
plt.savefig('plot.pdf', dpi=300)
Save chart as pdf with resolution
plt.savefig('plot.png')
Save chart as png
plt.text(2, 20, "Sample Text")
Add text at x=2, y=20
plt.annotate("Important", xy=(2, 20), xytext=(3, 25), arrowprops=dict(facecolor='black'))
Annotate the point xy with a label placed at xytext and an arrow between them
plt.xscale('log')
Logarithmic x-axis
plt.yscale('log')
Logarithmic y-axis
plt.xlim(0, 5)
X-axis limits
plt.ylim(0, 5)
Y-axis limits
plt.xticks([1, 2, 3])
Custom ticks in x-axis
plt.yticks([1, 2, 3])
Custom ticks in y-axis

Requests module

response = requests.get('https://api.example.com/data')
GET request
response = requests.post('https://api.example.com/create', data={'key': 'value'})
POST request
response = requests.put('https://api.example.com/update/1', data={'key': 'new_value'})
PUT request
response = requests.delete('https://api.example.com/delete/1')
DELETE request
response.status_code
Status Code
response.headers
Headers dictionary
response.text
Raw response as text
response.json()
Parse response as JSON
requests.get('https://example.com', proxies=proxies)
Request with proxy

Plotly module

import plotly.graph_objects as go
import plotly.express as px
df = px.data.gapminder()
Load the Gapminder dataset as a pandas DataFrame
px.line(df[df['country'] == 'India'], x='year', y='gdpPercap', title='GDP over time')
Line plot of one country's rows, x=year, y=gdpPercap, titled 'GDP over time'
px.bar(x=['A', 'B'], y=[10, 20], title='Bar Plot')
Bar plot
px.scatter(df, x='gdpPercap', y='lifeExp', color='continent', title='GDP vs Life Expectancy')
Scatter plot
px.scatter(df, x='gdpPercap', y='lifeExp', size='pop', color='continent', hover_name='country', log_x=True)
Bubble chart
px.choropleth(df[df['year']==2007], locations="iso_alpha", color="lifeExp", hover_name="country")
Map plot (Choropleth)
fig.update_layout(title='New Title', xaxis_title='X Axis', yaxis_title='Y Axis', template='plotly_dark')
To customize layout
fig.add_trace(go.Scatter(x=[1, 2, 3], y=[4, 5, 6], mode='lines+markers', name='Line'))
Line plot
fig = go.Figure(go.Bar(x=['A', 'B'], y=[10, 15]))
Bar plot
go.Figure(go.Pie(labels=['A', 'B'], values=[30, 70]))
Pie plot
fig.write_html("plot.html")
Save as html file
fig.write_image("plot.png")
Save as image file
fig.update_layout(hovermode='x unified')
Tooltip follows x
fig.update_traces(marker=dict(size=10))
Change marker size
fig.update_layout(dragmode='zoom')
Default zoom tool
fig.update_layout(template='plotly_dark')
Update the style of theme
px.scatter_geo(px.data.gapminder().query("year==2007"), locations="iso_alpha", color="continent", size="pop")
Map visualizations
from plotly.subplots import make_subplots
fig = make_subplots(rows=1, cols=2)
To set subplots
fig.add_trace(go.Scatter(x=[1, 2], y=[3, 4]), row=1, col=1)
Add a trace to a subplot

Random module

random.random()
Random float in the range [0.0, 1.0)
random.uniform(a, b)
Random float between a and b
random.randint(a, b)
Random integer between a and b, both included
random.randrange(0, 10, 2)
Random number from [0, 2, 4, 6, 8] (the stop value 10 is excluded)
random.choice(list)
Random element from a list
random.choices(list, weights=None, k=2)
k random elements from a list with replacement; weights is a list of relative weights for choosing each element
random.sample(list, k=2)
k unique elements (no replacement)
random.shuffle(list)
Shuffles a list in place
random.seed(a=None)
Seed the generator; pass a fixed value, e.g. random.seed(42), to get the same results on every run
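
The sketch below illustrates three points from this section: a fixed seed reproduces the same sequence, `randrange` never returns its stop value, and `sample` differs from `choices` in whether elements can repeat. The seed value 42, the `deck` list and the weights are arbitrary assumptions chosen for illustration.

```python
import random

random.seed(42)                      # fixed seed gives a reproducible sequence
first_run = [random.randint(1, 6) for _ in range(5)]

random.seed(42)                      # re-seeding restarts the same sequence
second_run = [random.randint(1, 6) for _ in range(5)]

print(first_run == second_run)       # the two runs are identical

# randrange excludes the stop value, so 10 is never produced here
even_draws = {random.randrange(0, 10, 2) for _ in range(1000)}
print(sorted(even_draws))

deck = ["A", "B", "C", "D"]
unique_pick = random.sample(deck, k=2)                           # no repeats
repeat_pick = random.choices(deck, weights=[10, 1, 1, 1], k=2)   # repeats possible
print(unique_pick, repeat_pick)
```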

SMTPlib module

server = smtplib.SMTP('smtp.gmail.com', 587)
Connect to the SMTP server on the STARTTLS port
server.starttls()
Upgrade the connection to TLS
server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
Connect with SMTP server through SSL
server.login('your_email@example.com', 'your_password')
Login to your account
server.sendmail(from_email, to_email, message)
To send mail from your email
server.quit()
Close the connection (Very Important)

Glob module

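glob.glob('*.txt')
All .txt files in current directory
glob.glob('**/*.txt', recursive=True)
Match .txt files in the current directory and all subdirectories
glob.glob('*.txt', recursive=False, include_hidden=False)
Same match with the defaults spelled out (include_hidden needs Python 3.11+); results are not sorted, so wrap the call in sorted() if order matters
Glob patterns are shell-style wildcards (*, ?, [abc]), not regular expressions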
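
To show the difference between a flat and a recursive pattern, here is a small sketch that builds a throwaway directory tree and globs it both ways. The file names and the temporary directory are assumptions made for the demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small tree: two .txt files at different depths and one .csv
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", "b.csv", os.path.join("sub", "c.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("demo")

    # '*' stays within one directory level
    top_level = glob.glob(os.path.join(root, "*.txt"))
    # '**' with recursive=True also descends into subdirectories
    recursive = glob.glob(os.path.join(root, "**", "*.txt"), recursive=True)

    # glob does not guarantee an order, so sort explicitly
    top_names = sorted(os.path.relpath(p, root) for p in top_level)
    all_names = sorted(os.path.relpath(p, root) for p in recursive)

print(top_names)
print(all_names)
```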
 

Os module

os.getcwd()
Returns the current working directory
os.chdir('path/to/directory')
Changes current working directory
os.listdir('path')
Lists files and folders in the specified path
os.mkdir('dirname')
Creates a single directory
os.makedirs('dir/subdir')
Creates intermediate directories as needed
os.rmdir('dirname')
Removes an empty directory
os.removedirs('dir/subdir')
Removes nested empty directories
os.remove('filename')
Removes a file
os.path.exists('path')
Returns True if path exists
os.path.isfile('path')
True if it's a file
os.path.isdir('path')
True if it's a directory
os.path.join('folder', 'file.txt')
Combines path parts using the right separator for the OS
os.path.basename('path/to/file.txt')
Returns 'file.txt'
os.path.dirname('path/to/file.txt')
Returns 'path/to'
os.path.split('path/to/file.txt')
Returns ('path/to', 'file.txt')
os.path.abspath('file.txt')
Returns absolute path to the file
os.environ.get('HOME')
Value of the HOME variable (the user's home directory on Unix-like systems), or None if it is not set
os.environ['MY_VAR'] = 'value'
Sets or creates an environment variable for the current process
os.system('ls')
Executes a shell command
os.getpid()
Current process ID
os.getppid()
Parent process ID
os.rename('old.txt', 'new.txt')
Renaming a file or a directory
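
A short sketch of the path helpers and environment access listed above. It touches no files; `os.path.isabs` and the variable name `NO_SUCH_VAR_XYZ` are additions used only to check the results.

```python
import os

path = os.path.join("path", "to", "file.txt")   # separator depends on the OS

# split() returns the same pieces as dirname() and basename()
head, tail = os.path.split(path)
same_parts = (os.path.dirname(path), os.path.basename(path))

# Environment changes are visible to this process (and its children) only
os.environ["MY_VAR"] = "value"
my_var = os.environ.get("MY_VAR")
missing = os.environ.get("NO_SUCH_VAR_XYZ", "fallback")  # default avoids KeyError

print(head, tail, my_var, missing)
print(os.path.isabs(os.path.abspath(tail)))
```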

IMAPlib module

imap = imaplib.IMAP4_SSL('imap.gmail.com', 993)
Connect to mail server
imap.login('your_email@example.com', 'your_password')
Login to your account
imap.list()
List all mailboxes (folders)
imap.select('INBOX')
Select a mailbox
status, messages = imap.search(None, 'ALL')
Search all emails
status, messages = imap.search(None, 'UNSEEN')
Search unread emails
status, messages = imap.search(None, 'FROM', '"sender@example.com"')
Search emails from a specific address
imap.store(latest_email_id, '+FLAGS', '\\Seen')
Mark an email as read
imap.store(latest_email_id, '+FLAGS', '\\Deleted')
Flag an email for deletion (it is removed on imap.expunge() or imap.close())
imap.close(), imap.logout()
Close the mailbox and log out (Very important)

Re module

re.search(pattern, string)
Searches for the first match anywhere in the string
re.match(pattern, string)
Checks for a match only at the beginning
re.fullmatch(pattern, string)
Matches entire string to pattern
re.findall(pattern, string)
Returns all non-overlapping matches
re.finditer(pattern, string)
Returns an iterator yielding match objects
re.sub(pattern, repl, string)
Replace matches with repl
re.split(pattern, string)
Split string by the matches
re.compile(pattern)
Precompile a pattern for reuse
Regular expression syntax used in Python:
.
Matches any character except newline, e.g. a.b.c matches a1b7c, asbfc, a9bkc etc.
?
Use after a character to match it 0 or 1 times
\
Escapes a special character (e.g. \. matches a literal dot) or starts a special sequence such as \d
*
Use after a pattern to match 0 or more repetitions
+
Use after a pattern to match 1 or more repetitions
^
Use before a pattern to anchor it to the start of the string
$
Use after a pattern to anchor it to the end of the string
{k}
Exactly k repetitions of a pattern
{k,l}
Between k and l repetitions (no space after the comma)
[]
Character set: matches any one of the characters listed inside, e.g. [abc]
\d
Digits [0-9]
\D
Anything that's not a digit
\w
Any word character [a-zA-Z0-9_]
\W
Anything that is not a word character
\s
Whitespace character (space, tab, newline)
\S
Non-whitespace character
\b
Word boundary, used to match whole words only, e.g. \bcat\b matches the 'cat' in 'cat' and 'little cat' but not in 'tomocat' or 'catatine'
\B
Non-word boundary, used to match text inside a longer word, e.g. \bun\B matches the 'un' in 'unmalicious' and 'unnasty' but not in 'un' or 'we un'
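
The boundary and quantifier rules above are easier to see on concrete strings. The sketch below runs a few patterns with the standard `re` module; the sample strings and the phone-number pattern are illustrative assumptions.

```python
import re

text = "cat, little cat, tomcat, catalog"

# \b on both sides: only the standalone word 'cat' matches
whole_words = re.findall(r"\bcat\b", text)

# \b before and \B after: 'un' must start a longer word
prefix_only = re.findall(r"\bun\B\w*", "un unhappy we un undo")

# {2,4}: runs of two to four digits (greedy, so long runs are cut into pieces)
digits = re.findall(r"\d{2,4}", "7 42 2024 123456")

# \s+: collapse any run of whitespace into a single space
cleaned = re.sub(r"\s+", " ", "too    many   spaces")

# A compiled pattern can be reused; fullmatch requires the whole string to match
phone = re.compile(r"\d{3}-\d{4}")
is_phone = bool(phone.fullmatch("555-1234"))

print(whole_words, prefix_only, digits, cleaned, is_phone)
```

Raw strings (`r"..."`) keep the backslashes intact, which matters for sequences such as `\b` that Python would otherwise interpret as a backspace character.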

Shutil module

shutil.copy(src, destination)
Copies file to destination
shutil.copy2(src, dst)
Like copy(), but also tries to preserve file metadata
shutil.copymode(src, dst)
Copies file permissions only
shutil.copystat(src, dst)
Copies file's metadata only
shutil.copytree(src, dst)
Copies entire directory tree
shutil.move(src, dst)
Moves or renames a file
shutil.rmtree('dir')
Deletes directory and everything in it
shutil.make_archive(base_name, format, root_dir)
Creates an archive in a supported format (e.g. 'zip', 'tar', 'gztar')
shutil.unpack_archive(filename, extract_dir)
Unpacks the archive
shutil.disk_usage(path)
Gets disk usage stats (total, used, free)
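
A minimal round trip through several of these calls, run inside a temporary directory so nothing outside it is touched. The directory and file names (`project`, `notes.txt`, `backup`, `restored`) are assumptions for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as work:
    src_dir = os.path.join(work, "project")
    os.makedirs(src_dir)
    with open(os.path.join(src_dir, "notes.txt"), "w") as fh:
        fh.write("hello")

    # copy2 copies the file and tries to keep its metadata
    shutil.copy2(os.path.join(src_dir, "notes.txt"),
                 os.path.join(src_dir, "notes_backup.txt"))

    # base_name has no extension; make_archive appends ".zip" itself
    archive = shutil.make_archive(os.path.join(work, "backup"), "zip",
                                  root_dir=src_dir)

    # Unpack into a fresh directory and list what came out
    restored = os.path.join(work, "restored")
    shutil.unpack_archive(archive, restored)
    restored_files = sorted(os.listdir(restored))

    shutil.rmtree(src_dir)                  # removes the tree, contents included
    source_gone = not os.path.exists(src_dir)

print(restored_files, source_gone)
```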

Bokeh module

from bokeh.plotting import figure, show
from bokeh.io import output_file, output_notebook
from bokeh.layouts import column, row
output_file("plot.html")
Output to html file
output_notebook()
Output to Jupyter notebook
p = figure(title="Simple Line", x_axis_label='x', y_axis_label='y')
Create a figure with a title and axis labels
p.line([1, 2, 3], [4, 6, 2])
Line plot
show(p)
Show the chart
p.circle(x, y, size=10)
Scatter plot
p.vbar(x=x, top=y, width=0.5)
Vertical bar plot
p.hbar(y=x, right=y, height=0.5)
Horizontal bar plot (categories x on the y-axis, bar lengths y)
p.triangle(x, y, size=12, color="green")
Shape plot; other glyphs are available, e.g. square, diamond
p.title.text = "Custom Title"
Set title
p.xaxis.axis_label = "X Axis"
Label x-axis
p.yaxis.axis_label = "Y Axis"
Label y-axis
p.background_fill_color = "lightgray"
Set background color
p.border_fill_color = "whitesmoke"
Set border fill color
p.outline_line_color = "black"
Set outline line color
p.line(x, y, legend_label="My Line", line_width=2)
Define legend_label for the legend entry
p.legend.location = "top_left"
Set legend location
p.legend.click_policy = "hide"
Make the legend interactive (click an entry to hide its glyph)
layout = row(p1, p2)
Arrange plots in a row
layout = column(p1, p2)
Arrange plots in a column
show(layout)
Show layout
from bokeh.models import ColumnDataSource
source = ColumnDataSource(data={'x': [1, 2, 3], 'y': [4, 6, 5]})
Set a data source
p.circle(x='x', y='y', source=source, size=10)
Scatter plot of circles from the data source
from bokeh.io.export import export_png
export_png(p, filename="plot.png")
Export chart to png file
p1.x_range = p2.x_range
Link x-axis
p1.y_range = p2.y_range
Link y-axis
from bokeh.embed import components
script, div = components(p)
Use in html templates

Numpy module

np.array([[1, 2, 3], [4, 5, 6]])
Creating a 2D array
np.zeros((3, 3))
3x3 array of zeros
np.ones((3, 3))
3x3 array of ones
np.full((2, 2), 7)
2x2 array of sevens
np.eye(3)
Identity matrix 3x3
np.arange(0, 10, 2)
An array of this: [0, 2, 4, 6, 8]
np.linspace(0, 1, 5)
5 evenly spaced values from 0 to 1, endpoints included
arr.shape
Dimensions of the array
arr.ndim
No. of dimensions
arr.size
Total no. of elements
arr.dtype
Data type
arr.reshape((2, 3))
Reshape an array to 2x3
arr.ravel()
Flatten an array to 1D
arr.T
Transpose the array
np.add(a, b)
a + b
np.subtract(a, b)
a - b
np.multiply(a, b)
a * b
np.divide(a, b)
a / b
np.power(a, 2)
a to the power of 2
np.sqrt(a)
Square root of a
np.exp(a)
Exponential value of a
np.log(a)
Natural log of a
np.mean(list)
Mean of the list
np.median(list)
Median of the list
np.std(list)
Standard deviation of the list
np.sum(list)
Sum of the list
np.max(list)
Maximum value in a list
np.min(list)
Minimum value in a list
np.argmax(list)
Index of maximum value
np.argmin(list)
Index of minimum value
np.concatenate([a, b])
Join arrays
np.vstack([a, b])
Stack vertically
np.hstack([a, b])
Stack horizontally
np.split(a, 3)
Split the array into 3 equal parts
np.unique(a)
Unique elements of the array
np.random.rand(2, 2)
A 2x2 array of random floats in the range [0, 1)
np.random.randn(2, 2)
A 2x2 array of random samples from the standard normal distribution
np.random.randint(0, 10, size=5)
A 1D array of 5 random integers from 0 to 9 (the upper bound 10 is excluded)
np.isnan(a)
Check for NaN values
np.isinf(a)
Check for Inf values
np.nan_to_num(a)
Convert NaN to 0 (and infinities to large finite numbers)
np.clip(a, 0, 1)
Limit values to the range 0 to 1
np.where(a > 0, 1, 0)
Conditional values
np.cumsum(a)
Cumulative sum
np.cumprod(a)
Cumulative product

Pytest module

assert result == k
Checks that result equals k; the test fails otherwise
@pytest.fixture
Defines a fixture: reusable setup (and teardown) code that runs before or after a test
@pytest.mark.parametrize("a, b, result", [(1, 2, 3), (4, 5, 9)])
Runs the same test once per tuple, passing each tuple's values as a, b and result
@pytest.mark.skip(reason="Not implemented yet")
Skip a particular test
@pytest.mark.skipif(condition, reason="...")
Skip the test if the condition is true
@pytest.mark.xfail
Mark a test that you expect to fail
pytest.raises()
Asserts that the enclosed code raises a specific exception type, e.g. with pytest.raises(ValueError):

Types of data structures

Lists
Ordered and mutable; support indexing, slicing and extending, syntax: my_list = [1, 1.21, "hello", True]
Tuples
Ordered and immutable; support indexing and slicing, syntax: my_tuple = (1, 10, "hello")
Sets
Unordered, with no duplicate elements; key operations are add(), remove(), union(), intersection(), difference(), syntax: my_set = {1, 2, 3, 3} (stored as {1, 2, 3})
Dictionary
Maps keys to values; mutable, with values accessed by key; common operations are get(), items(), keys(), values(), update(), syntax: my_dict = {"name": "Alice", "age": 30, "city": "New York"}
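
A brief sketch of the properties listed above, reusing the sheet's own sample values. The extra items appended to the list, the flag `tuple_changed` and the lookup of a `country` key are assumptions added for illustration.

```python
# Lists are mutable: they can grow in place
my_list = [1, 1.21, "hello", True]
my_list.append("new")
my_list.extend([2, 3])

# Tuples are immutable: item assignment raises TypeError
my_tuple = (1, 10, "hello")
try:
    my_tuple[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets drop duplicates and support set algebra
my_set = {1, 2, 3, 3}
common = my_set.intersection({2, 3, 4})

# Dictionaries map keys to values; get() avoids KeyError for missing keys
my_dict = {"name": "Alice", "age": 30, "city": "New York"}
my_dict.update({"age": 31})
country = my_dict.get("country", "unknown")

print(my_list, tuple_changed, my_set, common, my_dict, country)
```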