R Cheat Sheet (DRAFT)

This is a draft cheat sheet. It is a work in progress and is not finished yet.

CHEAT SHEET FOR R

By
Nanditha T (F17095)
Sanjana S (F17109)
Vivin Pearl Kishore (F17119)

Util functions

getwd()
gets the working directory
setwd('c:/file/path')
sets the working directory
ls()
lists all the variables in the workspace
rm(var_name)
removes the variable var_name
str(variable_name)
displays the structure of an object
help.start()
opens the HTML help system
install.packages("package_name")
installs a package
library("package_name")
loads an installed package so its contents are available to use
detach("package:package_name")
detaches the package
history()
displays history

Data Structures

Vectors
d=c(3,4,5)
Arrays
arr_2d = array(1:24, dim = c(6,4))
Matrices
mat = matrix(1:12, nrow=4, ncol=3)
Lists
list_data <- list("Red", "Green", c(21,32,11), TRUE, 5, 3)
Dataframe
df = data.frame(subjectID=1:5, gender=c("M","F","M","M","F"), score=c(8,3,6,5,5))

Vector

num = c(1,2,3,4,5,6)
numeric vector
chr = c("aaa","bbb")
character vector
log = c(TRUE,TRUE,FALSE)
logical vector
which.min(vec) / which.max(vec)
position of the min/max value
rep(1:5, times=3)
Replicate elements of vector
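
A minimal sketch of the vector helpers above. The vector `vec` and its values are made up for illustration.

```r
# Hypothetical example vector
vec <- c(7, 2, 9, 4)

min_pos  <- which.min(vec)        # index of the smallest value
max_pos  <- which.max(vec)        # index of the largest value
repeated <- rep(1:3, times = 2)   # whole sequence repeated twice
each_rep <- rep(1:3, each = 2)    # every element repeated twice

print(min_pos)    # 2
print(max_pos)    # 3
print(repeated)   # 1 2 3 1 2 3
print(each_rep)   # 1 1 2 2 3 3
```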

Arrays

arr_1d = array(1:24)
1-D array
arr_2d = array(1:24, dim=c(6,4))
2-D array
arr_3d = array(1:24, dim=c(4,3,2))
3-D array
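
A short sketch of how a 3-D array is indexed, assuming the same 4 x 3 x 2 shape as above. R fills arrays column by column, so the first index varies fastest. The variable names are illustrative only.

```r
arr <- array(1:24, dim = c(4, 3, 2))

shape     <- dim(arr)       # 4 3 2
one_value <- arr[2, 3, 1]   # row 2, column 3, first slice
slice_2   <- arr[, , 2]     # the whole second 4 x 3 slice

print(shape)
print(one_value)   # 10
print(slice_2)
```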

Matrix Functions

t(m)
transpose
m %*% n
matrix multiplication
solve(m,n)
find x in m %*% x = n
det(m)
determinant
m*n
element-wise product (not the dot product)
rbind(mat1,mat2) / cbind(mat1,mat2)
row/column bind
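
The difference between `*` and `%*%` is easy to get wrong, so here is a small sketch. The matrices `m` and `n` are invented for the example.

```r
m <- matrix(c(2, 0, 0, 2), nrow = 2)   # 2 times the identity matrix
n <- matrix(c(1, 2, 3, 4), nrow = 2)   # columns (1,2) and (3,4)

elementwise <- m * n        # multiplies matching entries
matprod     <- m %*% n      # matrix product
x           <- solve(m, n)  # solves m %*% x = n for x

print(elementwise)
print(matprod)
print(x)
```

Calling `solve(m)` with a single argument returns the inverse of `m`.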

Data Frames

df = data.frame(subjectID=1:5, gender=c("M","F","M","M","F"), score=c(8,3,6,5,5))
creates a data frame
fw = read.csv(file.choose())
imports data by choosing a file interactively
grass = read.csv('C:/path/sample.csv')
imports data from a specified path
View(df)
opens a spreadsheet-style data viewer
rbind(a_data_frame, another_data_frame)
binds the rows of two data frames (use cbind() to bind columns)
merge(frame1, frame2, by = "x")
Merge 2 data frames
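
A minimal sketch of `merge()` and `rbind()` on two small data frames. The frames, their columns and the key column `x` are hypothetical.

```r
# Hypothetical data: two frames sharing the key column "x"
frame1 <- data.frame(x = c(1, 2, 3), a = c("p", "q", "r"))
frame2 <- data.frame(x = c(2, 3, 4), b = c(TRUE, FALSE, TRUE))

inner   <- merge(frame1, frame2, by = "x")              # keys found in both frames
outer   <- merge(frame1, frame2, by = "x", all = TRUE)  # all keys, NA where missing
stacked <- rbind(frame1, frame1)                        # rows appended below each other

print(inner)
print(outer)
print(nrow(stacked))   # 6
```

By default `merge()` keeps only matching keys; `all = TRUE`, `all.x = TRUE` or `all.y = TRUE` keep unmatched rows as well.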

Descriptive Statistics

rowMeans(data[])
row mean
rowSums(data[])
row sum
colMeans(data[])
column mean
colSums(data[])
column sum
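
A quick illustration of the row and column summaries, using a made-up 2 x 3 matrix named `data`.

```r
data <- matrix(1:6, nrow = 2)   # rows are (1,3,5) and (2,4,6)

row_sums  <- rowSums(data)
row_means <- rowMeans(data)
col_sums  <- colSums(data)
col_means <- colMeans(data)

print(row_sums)    # 9 12
print(row_means)   # 3 4
print(col_sums)    # 3 7 11
print(col_means)   # 1.5 3.5 5.5
```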

Data type Conversion

Use is.foo to test for data type foo. Returns TRUE or FALSE
Use as.foo to explicitly convert it
is.numeric(), is.character(), is.vector(), is.matrix(), is.data.frame()
as.numeric(), as.character(), as.vector(), as.matrix(), as.data.frame()
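
A small sketch of testing and converting types. The input vector is invented; note that values that cannot be converted become `NA` (with a warning, suppressed here).

```r
x <- c("1", "2", "three")

was_character <- is.character(x)
was_numeric   <- is.numeric(x)

# "three" cannot be parsed as a number, so it becomes NA
nums <- suppressWarnings(as.numeric(x))

print(was_character)   # TRUE
print(was_numeric)     # FALSE
print(nums)            # 1 2 NA
```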
 

Creating a Function

function_name <- function(arg_1, arg_2, ...) {
   # function body
}
A function is called by writing its name followed by parentheses, e.g. function_name(arg_1, arg_2)
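
As an illustration of the template, here is a hypothetical function `rect_area` with one default argument.

```r
# Hypothetical function: area of a rectangle, height defaults to 1
rect_area <- function(width, height = 1) {
  width * height   # the last evaluated expression is the return value
}

a1 <- rect_area(3, 4)        # positional arguments
a2 <- rect_area(width = 5)   # named argument, default height used

print(a1)   # 12
print(a2)   # 5
```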

String functions

toString(x)
produces a single character string from the elements of x
noquote(x)
prints character strings without quotes
sprintf()
returns a character vector containing a formatted combination of text and variable values
cat()
converts its arguments to character strings, concatenates them and prints the result
toupper() / tolower()
converts text to uppercase/lowercase
substr(x, first, last)
extracts part of a string
strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)
splits the elements of a character vector into substrings
paste(..., sep = " ", collapse = NULL)
concatenates strings
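
A short sketch combining several of the string functions above. The sample string is arbitrary.

```r
s <- "Hello, World"

upper  <- toupper(s)
part   <- substr(s, 1, 5)                        # characters 1 to 5
pieces <- strsplit(s, ", ", fixed = TRUE)[[1]]   # strsplit returns a list
joined <- paste("a", "b", "c", sep = "-")        # joins separate arguments
one    <- paste(c("a", "b", "c"), collapse = "+") # joins elements of one vector
fmt    <- sprintf("%s has %d characters", part, nchar(part))

print(upper)    # "HELLO, WORLD"
print(part)     # "Hello"
print(pieces)   # "Hello" "World"
print(joined)   # "a-b-c"
print(one)      # "a+b+c"
print(fmt)      # "Hello has 5 characters"
```

`sep` separates the arguments of `paste()`, while `collapse` joins the elements of the resulting vector into one string.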

Factor functions

factor()
encodes a vector as a factor (the terms ‘category’ and ‘enumerated type’ are also used for factors)
levels()
provides access to the levels attribute of a variable
nlevels()
returns the number of levels of its argument
relevel()
re-orders the levels of a factor so that the level specified by ref is first and the others are moved down
unique()
returns a vector, data frame or array like x but with duplicate elements/rows removed
droplevels()
drops unused levels from a factor or, more commonly, from factors in a data frame
cut()
divides the range of x into intervals and codes the values in x according to the interval they fall into
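
A minimal sketch of the factor functions. The category labels and break points are made up for the example.

```r
sizes <- factor(c("small", "large", "medium", "small"))

lv     <- levels(sizes)                         # sorted: "large" "medium" "small"
sizes2 <- relevel(sizes, ref = "small")         # "small" becomes the first level
kept   <- droplevels(sizes[sizes != "medium"])  # removes the now unused level
bins   <- cut(c(1, 5, 9), breaks = c(0, 3, 6, 10))

print(lv)
print(levels(sizes2))   # "small" "large" "medium"
print(nlevels(kept))    # 2
print(levels(bins))     # "(0,3]" "(3,6]" "(6,10]"
```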

Date Time functions

Sys.time()
returns the current date and time (use Sys.Date() for today's date)
date()
returns the current date and time as a character string
as.POSIXlt()
converts an object to one of the two classes used to represent date/times
as.Date()
converts character data to dates
strptime()
converts character vectors to class "POSIXlt"; its input x is first converted by as.character
strftime()
a wrapper for format.POSIXlt; it and format.POSIXct first convert to class "POSIXlt" by calling as.POSIXlt
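
A small sketch of parsing and formatting dates. The dates are arbitrary and the time zone is fixed to UTC as an assumption, so the result does not depend on the local machine.

```r
d     <- as.Date("2024-03-15")
fmt   <- format(d, "%d/%m/%Y")   # Date to character
stamp <- strptime("2024-03-15 14:30:00", "%Y-%m-%d %H:%M:%S", tz = "UTC")
hr    <- format(stamp, "%H")     # extract the hour as text
gap   <- as.numeric(as.Date("2024-03-20") - d)   # difference in days

print(fmt)            # "15/03/2024"
print(class(stamp))   # "POSIXlt" "POSIXt"
print(hr)             # "14"
print(gap)            # 5
```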

Flow control functions

if (condition){ Do something when condition is true }
if (condition){ Do something when condition is true } else { Do something when condition is false }
if (condition1){ Do something when condition1 is true } else if (condition2){ Do something when condition2 is true } else if (condition3){ Do something when condition3 is true } else { Do something when none of the above conditions is true }
ifelse(condition, x, y)
switch(expression, case1, case2, case3, ...)
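
A runnable sketch of the flow control forms. The values and labels are invented. R comments start with `#`, and at top level `else` must follow the closing brace on the same line.

```r
x <- 7

# if / else if / else chain
if (x > 10) {
  size <- "big"
} else if (x > 5) {
  size <- "medium"
} else {
  size <- "small"
}

# ifelse() is vectorised: one result per element
parity <- ifelse(c(1, 2, 3) %% 2 == 0, "even", "odd")

# switch() on a string; the unnamed last value is the default
label <- switch("b", a = "first", b = "second", "other")

print(size)     # "medium"
print(parity)   # "odd" "even" "odd"
print(label)    # "second"
```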

Loop functions

while (condition){ Do something }
for (variable in sequence){ Do something }
apply(), lapply(), sapply()
A loop executes a statement or group of statements repeatedly, either while a condition holds or once per element of a sequence. The apply family calls a function on each element or margin without an explicit loop.
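
A minimal sketch comparing explicit loops with the apply family. All values are illustrative.

```r
# for loop: sum the numbers 1 to 5
total <- 0
for (i in 1:5) {
  total <- total + i
}

# while loop: count up to 3
n <- 0
while (n < 3) {
  n <- n + 1
}

squares <- sapply(1:3, function(v) v^2)   # returns a vector
lens    <- lapply(list("a", "bcd"), nchar) # always returns a list
m       <- matrix(1:6, nrow = 2)
col_max <- apply(m, 2, max)               # 2 = apply over columns

print(total)     # 15
print(n)         # 3
print(squares)   # 1 4 9
print(col_max)   # 2 4 6
```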

File format functions

read.csv()
reads a comma-separated (CSV) file into a data frame
read.table()
reads a delimited text file into a data frame
read.xlsx2()
reads data from an Excel sheet (from the xlsx package, not base R)
 

Data summary functions

summary()
returns descriptive statistics of the data
str()
structure of the variable
describe()
determines the type of a single variable and prints a concise statistical summary (not in base R; provided by add-on packages such as Hmisc)
class()
returns the class of an object
dim()
dimensions of an object
head()
returns the first parts of a vector, matrix, table, data frame or function (tail() returns the last parts)
names()
gets or sets the names of an object
View()
invokes a spreadsheet-style data viewer on a matrix-like R object
subset()
returns subsets of vectors, matrices or data frames which meet conditions
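
A short sketch of the inspection functions, reusing the small data frame from the Data Frames section. The filter condition is chosen only for illustration.

```r
df <- data.frame(subjectID = 1:5,
                 gender = c("M", "F", "M", "M", "F"),
                 score = c(8, 3, 6, 5, 5))

shape     <- dim(df)        # rows, columns
col_names <- names(df)
first_two <- head(df, 2)    # first two rows
last_two  <- tail(df, 2)    # last two rows
males_5   <- subset(df, gender == "M" & score >= 5, select = c(subjectID, score))

print(shape)       # 5 3
print(col_names)
print(males_5)
summary(df$score)  # min, quartiles, mean, max
```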

Visualization functions

par(mfrow=c(2,2))
arranges subsequent plots in a 2 x 2 grid, filled by row
barplot()
bar plot; shows the relationship between a numerical and a categorical variable
pie()
pie charts
mosaicplot()
plots a mosaic on the current graphics device
hist()
histogram
plot()
simple scatter plots
plot(density(x))
density plot; a non-parametric way to estimate the probability density function of a random variable
pairs()
produces a matrix of scatterplots
matplot()
plots the columns of one matrix against the columns of another
boxplot()
box plot; shows the distribution of a variable
qqnorm()
produces a normal quantile-quantile plot
qplot()
quick plot; a shortcut for common ggplot2 graphics (ggplot2 package)
ggplot(mydata1, aes(x = 1, fill = subject)) + geom_bar()
initializes a ggplot object and adds a bar layer (ggplot2 package)

Probability Distributions

Central tendency and Dispersion

mean()
find mean
median()
find median
range()
find range
sd()
find standard deviation
var()
find variance
cor()
find correlation

Hypothesis Testing

t.test(data)
1 sample t-test
t.test(data1,data2)
2 sample t-test
t.test(pre,post,paired=TRUE)
paired sample t-test
wilcox.test(data)
Wilcoxon signed rank test (one sample)
cor.test(data1,data2)
Correlation test
chisq.test(data)
Chi square test
shapiro.test(data)
Shapiro-Wilk normality test
aov()
ANOVA
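
A minimal sketch of a paired t-test on simulated data. The `pre` and `post` measurements are randomly generated for illustration, not real results; the test returns an object of class "htest" whose components can be read directly.

```r
set.seed(1)   # makes the simulated data reproducible

# Hypothetical measurements before and after a treatment
pre  <- rnorm(20, mean = 50, sd = 5)
post <- pre + rnorm(20, mean = 2, sd = 1)

res <- t.test(pre, post, paired = TRUE)

print(res$statistic)   # t value
print(res$parameter)   # degrees of freedom (n - 1 for a paired test)
print(res$p.value)     # p-value
```

The same pattern applies to the other tests listed above: store the result and inspect `$statistic` and `$p.value`.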

Algorithms - statistics

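summary(lm(y ~ x1 + x2 + x3, data=mydata))
multiple regression
summary(glm(y ~ x1 + x2 + x3, family="", data=mydata))
classification (generalized linear model; family must be filled in, e.g. "binomial" for logistic regression)
cluster = kmeans(data, centers = k)
clustering (k-means with k clusters)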
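
A rough sketch of fitting a regression and a k-means clustering on simulated data. The data frame `mydata`, its coefficients and the choice of two clusters are all assumptions made for the example.

```r
set.seed(42)   # reproducible simulated data

# Hypothetical data: y depends linearly on x1 and x2 plus small noise
mydata <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
mydata$y <- 1 + 2 * mydata$x1 - mydata$x2 + rnorm(50, sd = 0.1)

fit <- lm(y ~ x1 + x2, data = mydata)
print(coef(fit))   # estimates should be close to 1, 2 and -1

# k-means needs the number of clusters; 2 is an arbitrary choice here
km <- kmeans(mydata[, c("x1", "x2")], centers = 2)
print(table(km$cluster))   # cluster sizes
```

`summary(fit)` adds standard errors, p-values and R-squared to the coefficient estimates.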