
Seaborn Cheat Sheet (DRAFT) by

Data Visualization using Seaborn

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Introduction to Seaborn

Seaborn
Seaborn is a Python visualization library built on top of matplotlib that provides a high-level interface for drawing attractive statistical graphics. It integrates closely with pandas data structures, which makes it well suited to exploring and visualizing datasets.
Key Features
Simplified syntax for creating complex visualizations.
Built-in themes and color palettes to improve the aesthetics of plots.
Support for a wide range of statistical plots for exploring relationships in data.
Seamless integration with pandas DataFrames for easy data manipulation and visualization.
Capabilities for both univariate and multivariate visualizations.
Integration with matplotlib for fine-tuning and customization.
Getting Started
Install Seaborn using pip: pip install seaborn.
Import Seaborn in your Python script or Jupyter Notebook: import seaborn as sns.
Load your data into a pandas DataFrame if it is not already in one.
Start exploring your data using Seaborn's high-level plotting functions.

Installing Seaborn

Using pip
pip install seaborn
Using conda
conda install seaborn
Verify Installation
import seaborn as sns
print(sns.__version__)  # Prints the installed version if the import succeeded

Loading Data

Using Pandas
import pandas as pd
df = pd.read_csv('filename.csv')
# Load CSV file
Viewing Data
df.head()  # View first few rows
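Understanding Data
df.info()
# Summary of columns, dtypes, and non-null counts
df.describe()
# Descriptive statistics for numeric columns
Handling Missing Data
df.dropna()
# Drop rows with missing values (returns a new DataFrame; assign the result to keep it)
df.fillna(value)
# Fill missing values with value (also returns a new DataFrame)
Loading Built-in Datasets
import seaborn as sns
df = sns.load_dataset('dataset_name')
# Downloads a named example dataset; needs an internet connection on first use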
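The loading and inspection steps above can be sketched end to end. This is a minimal illustration, not a recommended workflow: the CSV content, column names, and fill value are made up, and an in-memory string stands in for 'filename.csv' so the snippet runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'filename.csv'
csv_text = """name,score,group
a,1.0,x
b,,y
c,3.0,x
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())      # first few rows
df.info()             # columns, dtypes, non-null counts
print(df.describe())  # descriptive statistics for numeric columns

# Both calls return new DataFrames; df itself is left unchanged.
dropped = df.dropna()               # drops the row with the missing score
filled = df.fillna({"score": 0.0})  # fills the missing score with 0.0
```

Because dropna() and fillna() return copies, the original df still contains the missing value after both calls.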

Basic Plotting Functions

sns.scatterplot(data=data, x='x_column', y='y_column')
Create a scatter plot to visualize the relationship between two variables.
sns.lineplot(data=data, x='x_column', y='y_column')
Generate a line plot to show trends in data over continuous intervals.
sns.barplot(data=data, x='x_column', y='y_column')
Construct a bar plot showing an aggregate (the mean by default) of a numeric variable for each category, with error bars.
sns.countplot(data=data, x='x_column')
Plot the frequency of unique values in a categorical variable.
sns.boxplot(data=data, x='x_column', y='y_column')
Draw a box plot to summarize the distribution of a continuous variable within different levels of a categorical variable.
sns.violinplot(data=data, x='x_column', y='y_column')
Create a violin plot to visualize the distribution of a continuous variable across different categories.
sns.histplot(data=data, x='column')
Generate a histogram to display the distribution of a single variable.

Customizing Plots

Changing Colors
Use the color parameter to set a single color for elements such as lines, markers, and bars. Use the palette parameter to choose a color palette, typically together with hue.
Adjusting Line Styles and Markers
Control the style of lines with the linestyle parameter and markers with the marker parameter. Options include solid lines ('-'), dashed lines ('--'), and various marker shapes ('o', 's', 'D', etc.).
Setting Plot Size
Call plt.figure(figsize=(width, height)) before plotting to set the figure size in inches. Figure-level functions such as catplot() and lmplot() create their own figure, so size them with their height and aspect parameters instead.
Adding Titles and Labels
Set the title of your plot with plt.title() and label the axes with plt.xlabel() and plt.ylabel(). Informative titles and labels make plots easier to understand.
Changing Font Sizes
Customize font sizes for titles, labels, and ticks using parameters such as fontsize or by accessing individual text elements.
Adjusting Axis Limits
Control the range of values displayed on the x and y axes with plt.xlim() and plt.ylim(). Set limits that focus on the regions of interest in your data.
Adding Grid Lines
Use plt.grid(True) to display grid lines on your plot, which makes values easier to read off.
Adding Legends
Seaborn adds a legend automatically when you use hue, size, or style. Call plt.legend() to build or customize a legend from artists that carry a label, adjusting its title and placement for clarity.

Saving Plots

Syntax
import matplotlib.pyplot as plt
import seaborn as sns
# Create your plot here
# Seaborn has no savefig(); save through matplotlib
plt.savefig("filename.extension")
Example
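import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
# Placeholder data; replace with your own DataFrame
data = pd.DataFrame({'x': [1, 2, 3], 'y': [2, 4, 5]})
# Create a scatter plot
sns.scatterplot(x='x', y='y', data=data)
# Save the plot as a PNG file
plt.savefig("scatter_plot.png")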
Supported File Formats
PNG (Portable Network Graphics)
JPG/JPEG (Joint Photographic Experts Group)
PDF (Portable Document Format)
SVG (Scalable Vector Graphics)
and other formats supported by the active matplotlib backend

Categorical Plots

barplot()
Displays the central tendency and confidence interval of numeric variables across different categories. Useful for comparing the mean or another aggregate statistic of numeric data for each category.
countplot()
Shows the count of observations in each category using bars. Suitable for exploring the distribution of categorical variables.
boxplot()
Visualizes the distribution of quantitative data across different levels of one or more categorical variables. Useful for identifying outliers and comparing distributions.
violinplot()
Combines the benefits of a box plot and a kernel density plot. Provides information about the distribution of data within each category.
stripplot() and swarmplot()
Scatterplots for categorical data. Both show individual data points along a categorical axis. swarmplot() avoids overlapping points by adjusting them along the categorical axis.
pointplot()
Shows point estimates and confidence intervals as markers joined by lines. Useful for comparing how an estimate changes across the levels of a categorical variable, optionally split by a second one with hue.
factorplot() (deprecated, use catplot() instead)
Renamed to catplot() in seaborn 0.9 and removed in later releases. It created different types of categorical plots based on the kind parameter.
catplot()
Replaces factorplot() and serves as a general figure-level function for categorical data. Supports plot types such as strip, swarm, box, violin, bar, count, and point through the kind parameter.
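A small sketch of the difference between the axes-level functions and catplot() follows. The data and column names are invented; the point is only that barplot() and boxplot() draw onto an existing Axes, while catplot() builds its own figure and returns a FacetGrid.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up example data
df = pd.DataFrame({
    "day": ["Thu", "Thu", "Thu", "Fri", "Fri", "Fri", "Sat", "Sat", "Sat"],
    "bill": [10.0, 14.0, 12.0, 20.0, 18.0, 25.0, 30.0, 28.0, 35.0],
})

# Axes-level functions draw on the Axes you pass in.
fig, (ax_bar, ax_box) = plt.subplots(1, 2, figsize=(9, 4))
sns.barplot(data=df, x="day", y="bill", ax=ax_bar)  # mean per day with error bars
sns.boxplot(data=df, x="day", y="bill", ax=ax_box)  # quartiles per day

# catplot() is figure-level: it creates a new figure and returns a FacetGrid.
g = sns.catplot(data=df, x="day", y="bill", kind="box")
g.set_axis_labels("Day", "Bill")
```

Switching kind (for example to "strip" or "violin") changes the plot type without changing the rest of the call.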
 

Distribution Plots

Distribution Plots
Distribution plots in Seaborn allow you to visualize the distribution of a dataset. These plots help you understand the underlying distribution of your data, including its central tendency, spread, and skewness.
Histograms
sns.histplot(data, x='column'): Plot a histogram of the specified column in the dataset. Customize with parameters like bins, kde, color, and alpha.
Kernel Density Estimation (KDE) Plots
sns.kdeplot(data, x='column'): Generate a smooth estimate of the probability density function. Additional parameters include bw_method, fill, and common_norm.
Rug Plots
sns.rugplot(data, x='column'): Draw a small tick for each data point along the x-axis. Useful for visualizing individual data points in combination with other plots.
Cumulative Distribution Function (CDF)
sns.ecdfplot(data, x='column'): Plot the empirical cumulative distribution function. Helps to visualize the cumulative proportion of data points.
Joint Distribution Plots
sns.jointplot(data=data, x='x_column', y='y_column', kind='kind'): Plot the joint distribution of two variables along with their marginal distributions. The kind parameter can be set to scatter, kde, hist, hex, or reg for different visualizations.
Pair Plots
sns.pairplot(data): Create pairwise plots for all numerical columns in the dataset. Offers a quick overview of relationships between multiple variables.
Violin Plots
sns.violinplot(data=data, x='x_column', y='y_column'): Visualize the distribution of a numeric variable for different categories. Provides insights into both the distribution and the probability density at different values.
Box Plots
sns.boxplot(data=data, x='x_column', y='y_column'): Summarize the distribution of a numeric variable for different categories using quartiles. Helps to identify outliers and compare distributions between categories.
Swarm Plots
sns.swarmplot(data=data, x='x_column', y='y_column'): Show each data point along with the distribution. Useful for small to moderate-sized datasets.
Violin-Swarm Combination
Combining violin and swarm plots can provide a comprehensive view of the distribution and individual data points.
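One way to build the violin-swarm combination is to draw both plots on the same Axes, as in this sketch. The data is randomly generated for illustration, and the styling choices (inner=None, gray violins, black points) are just one reasonable option.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: 30 values in each of two groups
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": ["a"] * 30 + ["b"] * 30,
    "value": np.concatenate([rng.normal(0, 1, 30), rng.normal(2, 1, 30)]),
})

fig, ax = plt.subplots(figsize=(6, 4))

# inner=None hides the violin's inner box so the points stay readable.
sns.violinplot(data=df, x="group", y="value", inner=None, color="lightgray", ax=ax)

# Draw the individual observations on top of the violins.
sns.swarmplot(data=df, x="group", y="value", color="black", size=4, ax=ax)
```

The violin shows the estimated density while the swarm shows every observation, which helps reveal whether the smooth shape is backed by enough data.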

Regression Plots

Regression Plots
Regression plots in Seaborn are useful for visualizing relationships between variables and fitting regression models to the data. Seaborn provides several functions for creating regression plots, allowing you to explore linear relationships, examine residuals, and detect outliers.
lmplot()
Figure-level function for plotting linear models. Syntax: sns.lmplot(data=data, x='x_column', y='y_column', ...). Displays a scatter plot with a linear regression line and can split the data across hue, col, and row. Useful for visualizing the relationship between two variables and assessing the fit of a linear model.
regplot()
Axes-level counterpart of lmplot(). Syntax: sns.regplot(data=data, x='x_column', y='y_column', ...). Produces a scatter plot with a regression line on a single matplotlib Axes (pass ax to choose one) and also accepts arrays or Series for x and y.
residplot()
Used for plotting the residuals of a linear regression. Syntax: sns.residplot(data=data, x='x_column', y='y_column', ...). Helps to diagnose the fit of the regression model by plotting the difference between observed and predicted values. Useful for identifying patterns or heteroscedasticity in residuals.
Additional Parameters
order: Specifies the order of the polynomial regression (default is 1 for linear).
scatter_kws: Additional keyword arguments passed to the underlying matplotlib scatter call.
line_kws: Additional keyword arguments passed to the underlying matplotlib line plot call.
ci: Size of the confidence interval for the regression estimate (default is 95); use None to skip it.
truncate: If True, bounds the regression line by the data limits.
Example
import matplotlib.pyplot as plt
import seaborn as sns
# Load sample data
tips = sns.load_dataset("tips")
# Create a regression plot
sns.lmplot(x="total_bill", y="tip", data=tips)
# Show the plot
plt.show()
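The additional parameters listed earlier can be illustrated with regplot() and residplot(). This is a sketch on synthetic data with an invented quadratic relationship; the styling values passed through scatter_kws and line_kws are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data with a roughly quadratic relationship plus noise
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 40)
df = pd.DataFrame({"x": x, "y": 0.5 * x**2 - x + rng.normal(0, 2, x.size)})

fig, (ax_fit, ax_resid) = plt.subplots(1, 2, figsize=(10, 4))

# Second-order polynomial fit with a 95% confidence band
sns.regplot(
    data=df, x="x", y="y", order=2, ci=95,
    scatter_kws={"s": 20, "alpha": 0.6},  # forwarded to the scatter call
    line_kws={"color": "red"},            # forwarded to the line call
    ax=ax_fit,
)

# Residuals of the same second-order fit; ideally scattered around zero
sns.residplot(data=df, x="x", y="y", order=2, ax=ax_resid)
```

If the residual panel shows a clear curve or a funnel shape, the chosen model order or a constant-variance assumption is probably not a good match for the data.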

Matrix Plots

Matrix Plots
Matrix plots in Seaborn are useful for visualizing data in matrix form, typically with heatmap-style representations.
Heatmaps
Use sns.heatmap() to create a colored matrix plot, with each cell representing the value of a variable in the dataset. Ideal for displaying correlation matrices or any two-dimensional data.
Cluster Maps
sns.clustermap() creates a hierarchical clustering heatmap. It's handy for exploring relationships between variables by grouping similar ones together.
Pair Plots
Although not strictly matrix plots, sns.pairplot() generates a matrix of scatterplots and histograms for quick visualization of relationships between multiple variables in a dataset.
Customization
Seaborn allows extensive customization of matrix plots, including adjusting color schemes, annotating cells with values, and tweaking axes.
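A correlation heatmap is probably the most common matrix plot, so here is a minimal sketch of one with annotated cells. The three columns are invented, and the colormap and value range are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up numeric data
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [5, 3, 4, 1, 2],
})

corr = df.corr()  # 3 x 3 correlation matrix

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    corr,
    annot=True,       # write the value in each cell
    fmt=".2f",        # two decimal places
    cmap="coolwarm",  # diverging color scheme
    vmin=-1, vmax=1,  # fix the color scale to the correlation range
    square=True,
    ax=ax,
)
```

Fixing vmin and vmax keeps the colors comparable between heatmaps of different datasets.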

Time Series Plots

Time Series Plots
Time series plots in Seaborn are useful for visualizing data over time. Seaborn provides several functions to create informative time series plots.
seaborn.lineplot(data=data, x='x_column', y='y_column')
Creates a line plot of y vs. x from the columns of data. Ideal for visualizing trends and patterns over time.
seaborn.relplot(data=data, x='x_column', y='y_column', kind='line')
Offers a figure-level interface for relational plots, including line plots for time series data. Use the kind parameter to specify the plot type; the default is 'scatter', so pass kind='line' for line plots.
seaborn.scatterplot(data=data, x='x_column', y='y_column')
Plots individual data points as scatter points. Suitable for visualizing relationships between variables over time.
seaborn.tsplot(data, time, unit, value)
Deprecated in seaborn 0.9 and removed in later releases. Use seaborn.lineplot() instead.
seaborn.lineplot(data=data, x='x_column', y='y_column', hue='group_column')
Visualizes time series data with confidence intervals: when several observations share the same x value, lineplot() plots their mean with a confidence band. Suitable for comparing multiple time series.

Style and Aesthetics

Seaborn Styles
seaborn.set_style(style=None): Set the aesthetic style of the plots. Styles include: 'darkgrid', 'whitegrid', 'dark', 'white', and 'ticks'.
Color Palettes
seaborn.color_palette(palette=None, n_colors=None, desat=None): Return a list of colors defining a palette. It does not change the active palette; use seaborn.set_palette() for that. Built-in palettes: 'deep', 'muted', 'bright', 'pastel', 'dark', 'colorblind', etc.
Contexts
seaborn.set_context(context=None, font_scale=1, rc=None): Set the context parameters for the plot. Contexts control the scale of plot elements. Contexts include: 'paper', 'notebook', 'talk', and 'poster'.
Plot Aesthetics
seaborn.despine(fig=None, ax=None, top=True, right=True, left=False, bottom=False, offset=None, trim=False): Remove axes spines from the plot (the top and right spines by default).
seaborn.set_palette(palette, n_colors=None, desat=None, color_codes=False): Set the matplotlib color cycle using a seaborn palette.
Other Aesthetic Tweaks
seaborn.set_theme(): Set the style, context, and palette in one step.
seaborn.set(): Alias for seaborn.set_theme().
seaborn.reset_defaults(): Restore all matplotlib rc parameters to their default settings.
Saving Aesthetic Settings
seaborn.axes_style(style=None, rc=None): Return a dictionary of parameters or use in a with statement to temporarily set the style.
seaborn.plotting_context(context=None, font_scale=1, rc=None): Return a dictionary of parameters or use in a with statement to temporarily set the context.
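The styling functions above can be combined as in the sketch below: a global theme, a palette fetched as a list of colors, and a temporary style and context inside a with statement. The data and the particular style names chosen are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up example data
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 3, 5, 4]})

# Global defaults for every plot created afterwards
sns.set_theme(style="darkgrid", context="notebook", palette="deep")

# color_palette() only returns colors; it does not change any settings.
palette = sns.color_palette("colorblind", n_colors=3)

# Temporary style and context: they apply only inside the with block.
with sns.axes_style("whitegrid"), sns.plotting_context("talk"):
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.scatterplot(data=df, x="x", y="y", color=palette[0], ax=ax)
    sns.despine(ax=ax)  # hide the top and right spines

# The same functions also return plain dictionaries of parameters.
whitegrid = sns.axes_style("whitegrid")
talk = sns.plotting_context("talk")
notebook = sns.plotting_context("notebook")
```

Using the with form avoids changing global settings, which is convenient when only one figure in a script or notebook needs a different look.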