
Data viz in R Cheat Sheet (DRAFT)

ggplot2 provides a number of geom functions. This sheet summarizes them.

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Geoms

What is a Geom?
A geom (geometric object) is a function that controls how your data is drawn on the plot, for example as points, lines or bars.

Basic Graph Features:

geom_blank():
Draws nothing. Useful for setting up an empty canvas or forcing the axis limits to include certain values
geom_path():
Connects data points in the order they appear in the data
geom_line():
Connects data points in order of their x values
geom_ribbon():
Draws a shaded band between ymin and ymax at each x value, for example an interval around a line
geom_segment():
Draws a straight line segment from (x, y) to (xend, yend)
geom_rect():
Draws rectangles defined by xmin, xmax, ymin and ymax
geom_polygon():
Draws filled polygons from the x and y coordinates of their vertices
geom_text():
Adds text labels at the given positions
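
The difference between geom_path() and geom_line() is easiest to see on data that is not sorted by x. The following is a minimal sketch, assuming ggplot2 is installed; the data frames df and band are hypothetical toy data made up for illustration.

```r
library(ggplot2)

# Hypothetical toy data: rows are deliberately not sorted by x.
df <- data.frame(x = c(3, 1, 2), y = c(1, 2, 3))

# geom_path() joins the points in row order.
p_path <- ggplot(df, aes(x, y)) + geom_path()

# geom_line() joins the points in order of x.
p_line <- ggplot(df, aes(x, y)) + geom_line()

# layer_data() returns the data as the layer will draw it,
# so the x columns show the two different orderings.
layer_data(p_path)$x
layer_data(p_line)$x

# geom_ribbon() shades the band between ymin and ymax at each x.
band <- data.frame(x = 1:5, y = c(2, 3, 5, 4, 6))
band$lo <- band$y - 1
band$hi <- band$y + 1
p_ribbon <- ggplot(band, aes(x)) +
  geom_ribbon(aes(ymin = lo, ymax = hi), alpha = 0.3) +
  geom_line(aes(y = y))
```

Print any of the plot objects (for example p_ribbon) to draw it.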

Single variable:

Discrete:
geom_bar():
Creates a bar graph where the height of each bar is the number of observations in that category
Continuous:
geom_histogram():
Creates a histogram, which shows the distribution of a continuous variable by counting observations in bins
geom_density():
Creates a density plot (a smoothed version of a histogram)
geom_dotplot():
Stacks one dot per observation. The width of each dot corresponds to the bin width
geom_freqpoly():
Creates a frequency polygon, which shows the same bin counts as a histogram but with lines instead of bars. Useful for comparing the distributions of several groups, since overlaid lines are easier to read than stacked histograms
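
As a rough illustration of the single-variable geoms, the sketch below uses the mpg dataset that ships with ggplot2. The choice of columns (class, hwy, drv) and the binwidth of 2 are assumptions made for the example, not recommendations.

```r
library(ggplot2)

# Discrete variable: one bar per vehicle class, height = number of rows.
p_bar <- ggplot(mpg, aes(class)) + geom_bar()

# Continuous variable: highway mileage counted in bins of width 2.
p_hist <- ggplot(mpg, aes(hwy)) + geom_histogram(binwidth = 2)

# The same bins drawn as lines, one line per drive type,
# which is easier to compare than stacked histograms.
p_freq <- ggplot(mpg, aes(hwy, colour = drv)) + geom_freqpoly(binwidth = 2)

# A smoothed alternative to the histogram, one curve per drive type.
p_dens <- ggplot(mpg, aes(hwy, fill = drv)) + geom_density(alpha = 0.4)
```

Print a plot object (for example p_freq) to draw it.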

Two variables:

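Both continuous:
geom_point():
Creates a scatterplot
geom_quantile():
Fits a quantile regression and draws the fitted quantile lines (requires the quantreg package)
geom_smooth():
Adds a smoothed trend line, by default with a confidence band. Use method = "lm" for a straight line of best fit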
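
A scatterplot with a fitted line can be sketched as follows, again assuming the mpg dataset from ggplot2. Here method = "lm" is chosen to get a straight line; it is one option among several smoothing methods.

```r
library(ggplot2)

# Engine displacement vs highway mileage, with a linear fit on top.
# se = TRUE draws the confidence band around the fitted line.
p_fit <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)
```

Layers are drawn in the order they are added, so the line sits on top of the points.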
Show distribution:
geom_bin2d():
Creates a heatmap of counts in rectangular bins. An alternative to geom_point() when there are too many points
geom_density2d():
Draws contour lines of a 2D density estimate
geom_hex():
Like geom_bin2d(), but the bins are hexagons (requires the hexbin package)
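
For large datasets, binning can replace a scatterplot. This is a minimal sketch using the diamonds and faithful datasets (diamonds ships with ggplot2, faithful with base R); bins = 30 is an arbitrary choice for the example.

```r
library(ggplot2)

# Roughly 54,000 points: a scatterplot would overplot heavily,
# so count the points in a 30 x 30 grid of rectangles instead.
p_bin <- ggplot(diamonds, aes(carat, price)) + geom_bin2d(bins = 30)

# Contour lines of a 2D density estimate.
p_dens2d <- ggplot(faithful, aes(eruptions, waiting)) + geom_density2d()

# geom_hex() is used the same way as geom_bin2d(),
# but needs the hexbin package to be installed.
```

Print p_bin or p_dens2d to draw the plot.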
At least one discrete:
geom_count():
Counts the observations at each location and maps the count to point size. Helps with overplotting when many points share the same position
geom_jitter():
Adds a small amount of random noise to the position of each point so that overlapping points become visible
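
Both geoms address overplotting, in different ways. The sketch below assumes the mpg dataset from ggplot2, where city and highway mileage are whole numbers and many cars share the same pair of values. The jitter amounts (0.3) and the seed are arbitrary choices for the example.

```r
library(ggplot2)

# geom_count(): one point per distinct (cty, hwy) pair,
# sized by the number of cars at that position.
p_count <- ggplot(mpg, aes(cty, hwy)) + geom_count()

# geom_jitter(): one point per car, nudged by random noise.
# set.seed() makes the random offsets reproducible.
set.seed(1)
p_jitter <- ggplot(mpg, aes(cty, hwy)) +
  geom_jitter(width = 0.3, height = 0.3)
```

In the data computed for p_count, the column n holds the count behind each point.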
One continuous, one discrete:
geom_bar(stat = "identity"):
By default geom_bar() uses stat = "count", so the height of each bar is the number of cases in each group. If you want the heights of the bars to represent values in the data, use stat = "identity" and map a variable to the y aesthetic. geom_col() is a shortcut for this
geom_boxplot():
Creates a box plot (median, quartiles and outliers for each group)
geom_violin():
Creates a violin plot (like a box plot, but instead of a box you see the shape of the data's distribution)
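
To make the stat = "identity" case concrete, here is a minimal sketch. The sales data frame is hypothetical and made up for illustration; mpg ships with ggplot2.

```r
library(ggplot2)

# Hypothetical pre-summarised data: one row per bar.
sales <- data.frame(
  region  = c("North", "South", "West"),
  revenue = c(12, 7, 9)
)

# Bar heights come straight from the revenue column.
p_identity <- ggplot(sales, aes(region, revenue)) +
  geom_bar(stat = "identity")

# geom_col() draws the same bars without the stat argument.
p_col <- ggplot(sales, aes(region, revenue)) + geom_col()

# Distribution of a continuous variable within each group.
p_box    <- ggplot(mpg, aes(drv, hwy)) + geom_boxplot()
p_violin <- ggplot(mpg, aes(drv, hwy)) + geom_violin()
```

Print any of the plot objects to draw it.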
One time, one continuous:
geom_area():
Area plot (a line plot with the region between the line and the axis filled in)
geom_line():
Line plot
geom_step():
Step plot. Holds each value until the next data point, creating a line that looks like a staircase
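
The three time-series geoms take the same aesthetics, so they can be swapped freely. A minimal sketch, assuming the economics dataset that ships with ggplot2 (date is the month, unemploy the number of unemployed in thousands):

```r
library(ggplot2)

# Shared base plot: time on x, the continuous variable on y.
base <- ggplot(economics, aes(date, unemploy))

p_area <- base + geom_area()  # filled region under the line
p_line <- base + geom_line()  # points joined by straight lines
p_step <- base + geom_step()  # staircase: value held until the next point
```

Building the base plot once and adding a different geom each time keeps the comparison consistent.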
Spatial:
geom_map():
Draws geographic regions as polygons taken from a reference map