Exploratory data analysis in R Cheat Sheet (DRAFT)

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Barplot

barplot(height), where height is a vector or matrix
Options: horiz=TRUE, main, xlab, ylab, names.arg
If height is a matrix, a stacked bar plot is produced by default; beside=TRUE produces a grouped bar plot
Options: beside=TRUE, col, legend
If height is a factor or ordered factor, use the plot() function
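
The options above can be sketched as follows. This is a minimal example that assumes the built-in mtcars data set (the cheat sheet does not name a data set for this section); the titles, labels and colors are illustrative choices only.

```r
# Vector input: number of cars per cylinder count (mtcars is an assumed example)
counts <- table(mtcars$cyl)
barplot(counts, main = "Cars by cylinders",
        xlab = "Number of cylinders", ylab = "Frequency")

# Horizontal bars with custom bar labels
barplot(counts, horiz = TRUE, names.arg = c("4 cyl", "6 cyl", "8 cyl"))

# Matrix input: rows = transmission (0 = automatic, 1 = manual), columns = cylinders
counts2 <- table(mtcars$am, mtcars$cyl)
barplot(counts2, col = c("grey30", "grey80"),
        legend.text = c("automatic", "manual"))   # stacked (default)
barplot(counts2, beside = TRUE, col = c("grey30", "grey80"),
        legend.text = c("automatic", "manual"))   # grouped

# Factor input: plot() draws the bar plot directly
plot(factor(mtcars$cyl), main = "Cars by cylinders")
```

The full argument name legend.text is used here; the shorter legend= shown in the options list reaches the same argument through partial matching.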

Spinograms

spine() function in the vcd package
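
A possible usage sketch, assuming the vcd package is installed and using its bundled Arthritis data set (neither the data set nor the title comes from the cheat sheet itself):

```r
# Only runs if vcd is available; install.packages("vcd") would be needed otherwise
if (requireNamespace("vcd", quietly = TRUE)) {
  data("Arthritis", package = "vcd")   # example data shipped with vcd
  # Two-way table: treatment group by degree of improvement
  counts <- table(Arthritis$Treatment, Arthritis$Improved)
  # Each bar is rescaled to height 1, so segments show proportions
  vcd::spine(counts, main = "Spinogram Example")
}
```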

Pie charts

(Figure: pie charts)
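
The original figure is not recoverable, so the following is only a hedged sketch of a basic pie chart with base R's pie(), assuming the built-in mtcars data set and illustrative labels:

```r
# Share of cars by cylinder count (mtcars is an assumed example data set)
slices <- table(mtcars$cyl)
lbls <- paste(names(slices), "cyl")   # "4 cyl", "6 cyl", "8 cyl"
pie(slices, labels = lbls, main = "Cars by cylinders")
```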

Fan plot

fan.plot() in the plotrix package
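
A minimal sketch of a call, assuming the plotrix package is installed and reusing mtcars cylinder counts as illustrative input (the data and labels are assumptions, not taken from the cheat sheet):

```r
# Sector sizes and labels (mtcars is an assumed example data set)
cyl_counts <- table(mtcars$cyl)
slices <- as.vector(cyl_counts)
lbls <- paste(names(cyl_counts), "cyl")

# Only draws if plotrix is available
if (requireNamespace("plotrix", quietly = TRUE)) {
  # Overlapping sectors make relative sizes easier to compare than in a pie chart
  plotrix::fan.plot(slices, labels = lbls, main = "Fan Plot")
}
```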

(Figure: fan plot)

Histogram

(Figure: histogram)
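
Since only the figure caption survives here, this is a hedged sketch of a histogram with base R's hist(), assuming the built-in mtcars data set; the number of breaks and the color are arbitrary choices:

```r
# Histogram of fuel economy; breaks = 12 is a suggestion that R may adjust
hist(mtcars$mpg, breaks = 12, col = "grey80",
     xlab = "Miles per Gallon", main = "Histogram of mpg")

# The bin counts can be inspected without drawing anything
h <- hist(mtcars$mpg, plot = FALSE)
h$counts
```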

Kernel density plot

(Figures: kernel density plots)
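
The original figures are missing, so the following is only a sketch of one common way to draw a kernel density plot in base R, assuming the built-in mtcars data set:

```r
# Kernel density estimate with the default Gaussian kernel and bandwidth
d <- density(mtcars$mpg)
plot(d, main = "Kernel density of mpg", xlab = "Miles per Gallon")
rug(mtcars$mpg)   # tick marks showing the raw observations
```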

Box plots

Five-number summary: minimum, lower quartile (25th percentile), median (50th percentile), upper quartile (75th percentile), maximum
Outliers: values more than 1.5*IQR below the lower quartile or above the upper quartile
Example:
boxplot(mtcars$mpg, main="Box plot", ylab="Miles per Gallon")
boxplot.stats(mtcars$mpg)
Parallel box plots for comparison:
boxplot(formula, data=dataframe)
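
To make the outlier rule concrete, the sketch below recomputes the fences by hand and compares them with boxplot.stats(). It assumes hinge-based quartiles from fivenum(), which is what boxplot.stats() uses and which can differ slightly from quantile(); the variable names are illustrative.

```r
x <- mtcars$mpg

five <- fivenum(x)            # min, lower hinge, median, upper hinge, max
iqr <- five[4] - five[2]      # spread of the middle half of the data

# Fences: 1.5 * IQR beyond the lower and upper hinges
lower_fence <- five[2] - 1.5 * iqr
upper_fence <- five[4] + 1.5 * iqr
manual_out <- x[x < lower_fence | x > upper_fence]

bs <- boxplot.stats(x)
bs$stats   # whisker ends, hinges and median as drawn by boxplot()
bs$out     # points that boxplot() would draw individually as outliers
```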

Box plots

boxplot(mpg ~ cyl, data=mtcars, main="Car Mileage Data", xlab="Number of Cylinders", ylab="Miles per Gallon")

Box plots

boxplot(mpg ~ cyl, data=mtcars, notch=TRUE, varwidth=TRUE, col="red", main="Car Mileage Data", xlab="Number of Cylinders", ylab="Miles per Gallon")

Box plots (2 crossed factors)

(Figure: box plots for two crossed factors)
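
The code behind this figure is not included in the source. One way such a plot could be produced is sketched below, assuming mtcars with transmission (am) crossed with cylinders (cyl); the factor labels, colors and titles are illustrative assumptions.

```r
# Turn the two grouping variables into labelled factors
mt <- transform(mtcars,
                cyl.f = factor(cyl, levels = c(4, 6, 8)),
                am.f  = factor(am, levels = c(0, 1),
                               labels = c("auto", "manual")))

# am.f * cyl.f in the formula yields one box per factor combination
bp <- boxplot(mpg ~ am.f * cyl.f, data = mt, varwidth = TRUE,
              col = c("gold", "darkgreen"),
              main = "MPG by transmission and cylinders",
              xlab = "Transmission.Cylinders", ylab = "Miles per Gallon")
bp$names   # group labels in plotting order
```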

Violin plots

Violin plots are kernel density plots superimposed in mirror-image fashion over box plots
vioplot(x1, x2, …, names=, col=)

Violin plots

library(vioplot)  # requires the vioplot package
# mpg values split by number of cylinders
x1 <- mtcars$mpg[mtcars$cyl==4]
x2 <- mtcars$mpg[mtcars$cyl==6]
x3 <- mtcars$mpg[mtcars$cyl==8]
vioplot(x1, x2, x3, names=c("4 cyl", "6 cyl", "8 cyl"), col="gold")
title("Violin Plots of Miles Per Gallon")

Dot plots

dotchart(mtcars$mpg, labels=row.names(mtcars), cex=.7, main="Gas Mileage for Car Models", xlab="Miles Per Gallon")

Dot plots - grouped, sorted and colored

# sort cars by mpg and treat cylinders as a grouping factor
x <- mtcars[order(mtcars$mpg),]
x$cyl <- factor(x$cyl)
# one color per cylinder group
x$color[x$cyl==4] <- "red"
x$color[x$cyl==6] <- "blue"
x$color[x$cyl==8] <- "darkgreen"
dotchart(x$mpg, labels = row.names(x), cex=.7, groups = x$cyl, gcolor = "black", color = x$color, pch=19, main = "Gas Mileage for Car Models\ngrouped by cylinder", xlab = "Miles Per Gallon")