Cheatography

# Introduction to Package R_Sheet Cheat Sheet by Noella

### Util Functions

| Command | Description |
| --- | --- |
| `getwd()` | Get the working directory |
| `setwd("C:/file/path")` | Set the working directory |
| `help.start()` | Open help |
| `install.packages("package")` | Install a package |
| `library("package")` | Make package contents available |
| `detach("package:name")` | Detach a package (the name takes the `package:` prefix) |
| `x = read.csv(file.choose())` | Import data |
| `ls()` | List the variables |
| `str(var)` | Structure of a variable |
| `rm(var)` | Remove a variable |

### Arrays and Matrix

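| Command | Description |
| --- | --- |
| `a1 = array(1:24)` | 1-D array |
| `a2 = array(1:24, dim = c(6, 4))` | 2-D array |
| `a3 = array(1:24, dim = c(4, 3, 2))` | 3-D array |
| `matrix(1:12, nrow = 4, ncol = 3)` | Matrix |
| `rbind(mat1, mat2)` / `cbind(mat1, mat2)` | Row / column bind |
| `t(mat)` | Transpose |

Variable names cannot start with a digit in R, so the arrays are named `a1`, `a2`, `a3` rather than `1D`, `2D`, `3D`.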
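
A minimal sketch of the array and matrix calls above, using only base R. The variable names (`a3`, `m`, `tm`, `stacked`, `side`) are illustrative choices, not part of the original sheet.

```r
# 3-D array: 4 rows, 3 columns, 2 layers, filled column by column
a3 <- array(1:24, dim = c(4, 3, 2))
dim(a3)          # 4 3 2
a3[2, 3, 1]      # element in row 2, column 3, layer 1

# 4 x 3 matrix, also filled column by column
m <- matrix(1:12, nrow = 4, ncol = 3)
tm <- t(m)               # transpose: 3 x 4
stacked <- rbind(m, m)   # rows appended: 8 x 3
side <- cbind(m, m)      # columns appended: 4 x 6
```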

### Descriptive Statistics

| Command | Description |
| --- | --- |
| `rowMeans(data[])` / `colMeans(data[])` | Row / column mean |
| `rowSums(data[])` / `colSums(data[])` | Row / column sum |
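
As a quick illustration, the four functions applied to a small matrix. The matrix `data` here is made up for the example.

```r
# 2 x 3 matrix filled column by column: columns are (1,2), (3,4), (5,6)
data <- matrix(1:6, nrow = 2)

rs <- rowSums(data)    # 9 12
rm_ <- rowMeans(data)  # 3 4
cs <- colSums(data)    # 3 7 11
cm <- colMeans(data)   # 1.5 3.5 5.5
```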

### Graphical Plots

| Command | Description |
| --- | --- |
| `qplot(data, ...)` | Quick plot (ggplot2); for a quantile-quantile plot, base R offers `qqnorm(data)` with `qqline(data)` |
| `ggplot(data = NULL, mapping = aes(), ...)` | Initializes a ggplot object |
| `geom_bar()` | Bar graph |
| `coord_flip()` | Flip x and y coordinates |
| `facet_grid()` | Lay out panels in a grid |
| `geom_density()` / `geom_histogram()` / `geom_point()` | Density / histogram / scatter plot |

### Strings

| Command | Description |
| --- | --- |
| `toString(x)` | Produces a single character string |
| `toupper()` / `tolower()` | Converts to upper / lower case |
| `substring(chr, first, last)` | Retrieves or replaces the substring |
| `paste(..., sep = " ", collapse = NULL)` | Converts to character and concatenates |
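
A short sketch of the string functions above in base R. The sample strings are invented for illustration.

```r
x <- c(1, 2, 3)
s <- toString(x)                 # "1, 2, 3"
up <- toupper("cheat sheet")     # "CHEAT SHEET"

chr <- "statistics"
part <- substring(chr, 1, 4)     # "stat"
substring(chr, 1, 4) <- "STAT"   # replaces in place: "STATistics"

joined <- paste("a", 1, TRUE, sep = "-")               # "a-1-TRUE"
collapsed <- paste(c("x", "y", "z"), collapse = "+")   # "x+y+z"
```

`sep` separates the arguments of one call, while `collapse` joins the elements of the resulting vector into a single string.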

### Vector

| Command | Description |
| --- | --- |
| `num = c(1, 2, 3, 4, 5, 6)` | Numeric vector |
| `chr = c("aaa", "bbb")` | Character vector |
| `log = c(TRUE, TRUE, FALSE)` | Logical vector |
| `mean(vec)` | Mean |
| `sd(vec)` | Standard deviation |
| `var(vec)` | Variance |
| `range(vec)` | Range |
| `which.min(vec)` / `which.max(vec)` | Position of the min / max value |
| `rep(1:5, times = 3)` | Replicate elements of a vector |
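
The summary functions above, applied to one example vector. The values in `vec` are arbitrary.

```r
vec <- c(4, 8, 15, 16, 23, 42)

m <- mean(vec)         # 18
v <- var(vec)          # 182 (sample variance, divides by n - 1)
s <- sd(vec)           # sqrt(182)
r <- range(vec)        # 4 42
lo <- which.min(vec)   # 1 (position, not value)
hi <- which.max(vec)   # 6
reps <- rep(1:3, times = 2)   # 1 2 3 1 2 3
```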

### Probability Distributions

| Command | Distribution |
| --- | --- |
| `rbinom(n, size, prob)` | Binomial |
| `rpois(n, lambda)` | Poisson |
| `runif(n, min = 0, max = 1)` | Uniform |
| `rnorm(n, mean, sd)` | Normal |
| `rexp(n)` | Exponential |
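
A minimal sketch drawing five random values from each distribution. The seed and parameter values are arbitrary choices for the example; `set.seed()` only makes the draws reproducible.

```r
set.seed(42)   # arbitrary seed, for reproducibility

b <- rbinom(5, size = 10, prob = 0.5)   # successes out of 10 trials
p <- rpois(5, lambda = 3)               # counts with mean 3
u <- runif(5)                           # uniform on [0, 1]
z <- rnorm(5, mean = 0, sd = 1)         # standard normal
e <- rexp(5)                            # exponential with rate 1
```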

### Data Frames

| Command | Description |
| --- | --- |
| `df = data.frame(subjectID = 1:5, gender = c("M", "F", "M", "M", "F"), score = c(8, 3, 6, 5, 5))` | Create a data frame |
| `fw = read.csv(file.choose())` | Import data by choosing a file |
| `grass = read.csv('C:/Users/Downloads/grass.csv')` | Import data by specifying a path |
| `View(df)` | Open the data frame in a spreadsheet-style viewer |
| `rbind(a_data_frame, another_data_frame)` | Bind rows of data frames (`cbind()` binds columns) |
| `merge(frame1, frame2, by = "x")` | Merge 2 data frames |
| `summary(df)` | Returns descriptive statistics of the data |
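
A short sketch combining the data frame calls above. `df` is the frame from the sheet; `ages` and `more` are hypothetical frames added only to show `merge()` and `rbind()`.

```r
df <- data.frame(subjectID = 1:5,
                 gender = c("M", "F", "M", "M", "F"),
                 score = c(8, 3, 6, 5, 5))

# Hypothetical second frame sharing the subjectID key
ages <- data.frame(subjectID = c(1L, 2L, 3L, 6L),
                   age = c(21, 34, 28, 40))

# By default merge() keeps only keys present in both frames
merged <- merge(df, ages, by = "subjectID")

# Hypothetical extra rows with the same columns as df
more <- data.frame(subjectID = 6:7,
                   gender = c("F", "M"),
                   score = c(9, 4))
all_rows <- rbind(df, more)

avg_score <- mean(df$score)   # 5.4
summary(df)
```

Passing `all = TRUE` to `merge()` would keep unmatched rows from both frames and fill the gaps with `NA`.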

### Loops

| Syntax | Description |
| --- | --- |
| `if (condition) { Do something } else { Do something different }` | if-else statement |
| `while (condition) { Do something }` | while loop |
| `for (variable in sequence) { Do something }` | for loop |
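
The three constructs above, filled in with small made-up computations. The last line also shows `ifelse()`, the vectorized function, which is distinct from the `if`/`else` statement.

```r
# for loop with if-else: add even numbers, subtract odd ones
total <- 0
for (i in 1:5) {
  if (i %% 2 == 0) {
    total <- total + i
  } else {
    total <- total - i
  }
}
# total is -1 + 2 - 3 + 4 - 5 = -3

# while loop: smallest n with 2^n >= 100
n <- 0
while (2^n < 100) {
  n <- n + 1
}
# n is 7, since 2^7 = 128

# ifelse() tests each element of a vector
labels <- ifelse(c(8, 3, 6) >= 5, "pass", "fail")
```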

### Hypothesis testing

| Command | Description |
| --- | --- |
| `t.test(data)` | One-sample t test |
| `t.test(data1, data2)` | Two-sample t test |
| `t.test(pre, post, paired = TRUE)` | Paired-sample t test |
| `wilcox.test(data)` | Wilcoxon test |
| `cor.test(data1, data2)` | Correlation test |
| `chisq.test(data)` | Chi-square test |
| `shapiro.test(data)` | Shapiro-Wilk normality test |
| `aov()` | ANOVA |
| `summary(lm(y ~ x1 + x2 + x3, data = mydata))` | Multiple regression |
| `summary(glm(y ~ x1 + x2 + x3, family = "", data = mydata))` | Generalized linear model (e.g. classification) |
| `cluster = kmeans(data, centers)` | Cluster analysis (`centers` is the number of clusters) |
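
A minimal sketch running a few of these tests on simulated data. The `pre` and `post` vectors, the seed, and the choice of two clusters are all assumptions made for the example, not values from the sheet.

```r
set.seed(1)   # arbitrary seed, for reproducibility

# Simulated before/after measurements for 20 subjects
pre <- rnorm(20, mean = 50, sd = 5)
post <- pre + rnorm(20, mean = 2, sd = 3)

paired <- t.test(pre, post, paired = TRUE)   # paired-sample t test
paired$p.value

ct <- cor.test(pre, post)       # Pearson correlation test by default
ct$estimate

sw <- shapiro.test(pre)         # normality check
sw$p.value

fit <- lm(post ~ pre)           # simple linear regression
r2 <- summary(fit)$r.squared

km <- kmeans(cbind(pre, post), centers = 2)   # two clusters, chosen arbitrarily
km$cluster
```

Each test returns a list-like object, so individual results such as `p.value` or `estimate` can be extracted with `$` rather than read off the printed output.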