R BASIC Cheat Sheet (DRAFT)

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Basics of the basics

print()
no further explanation needed
exponentiation
^
modulo
%%
assignment
= or <-
equality test
==
case sensitive
YES
or
|
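double operators
&&, || work on single (length-one) values and short-circuit; older R versions silently examined only the 1st element of a longer vector, R 4.3.0 and later raise an error instead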
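The operators above can be tried directly at the console. This is a minimal sketch; the variable names are illustrative and not part of the cheat sheet.

```r
# Illustrative values; the names below are assumptions for this example.
x <- 7          # assignment with <-
y = 2           # assignment with = also works at top level

power  <- x ^ y     # exponentiation: 49
remain <- x %% y    # modulo: 1
same   <- x == y    # equality test: FALSE

# | is vectorised: it compares element by element.
v <- c(TRUE, FALSE, FALSE) | c(FALSE, FALSE, TRUE)   # TRUE FALSE TRUE

# || works on single values and short-circuits:
# the right-hand side is not evaluated when the left side is TRUE.
either <- (x > 5) || stop("never reached")

print(power)
```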

Factor

factor()
2 types
nominal vs. ordinal
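A short sketch of the two factor types, using made-up category values:

```r
# Nominal factor: categories without an inherent order.
blood <- factor(c("A", "O", "B", "O"))
levels(blood)            # levels are sorted: "A" "B" "O"

# Ordinal factor: categories with a defined order.
size <- factor(c("small", "large", "medium"),
               levels = c("small", "medium", "large"),
               ordered = TRUE)

# Comparisons are only meaningful for ordered factors.
is_smaller <- size[1] < size[2]   # "small" < "large": TRUE
is_smaller
```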

Data structures

vector
c(): one-dimensional array, same data type, arithmetic calculations OK, names()
matrix
two-dimensional array, same data type, arithmetic calculations OK
data frame
two-dimensional object, different data types
list
one-dimensional object, different data types
A data frame is a list in which each element is a column, so df[[col]] returns that column
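The four structures side by side, as a minimal sketch with illustrative values:

```r
# Vector: one dimension, one data type, optional names.
v <- c(a = 1, b = 2, c = 3)
v * 2                          # arithmetic is element-wise

# Matrix: two dimensions, one data type.
m <- matrix(1:6, nrow = 2)     # filled column by column
m + 1

# Data frame: two dimensions, columns may differ in type.
df <- data.frame(id = 1:3, name = c("x", "y", "z"))

# List: one dimension, elements may differ in type.
l <- list(num = 1, chr = "a", vec = 1:3)

# A data frame is a list of columns, so [[ ]] extracts a column.
identical(df[["id"]], df$id)   # TRUE
is.list(df)                    # TRUE
```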

FUNCTION

fun <- function(arg1, arg2) {...}
3 elements: arguments, body, environment
return(): stops execution and returns a value; use it to return early in special cases, not routinely
functions are objects
good practice: descriptive names, a consistent naming convention, no overriding of existing object names, data arguments come first
functions can be arguments to other functions --> functional programming
pipe operator: %>% (magrittr)
safely(), possibly(), quietly() (purrr) - handling errors
side effects: anything a function does beyond returning its result, e.g. printing output, plotting, saving files to disk
3 main pitfalls for robust functions: type-unstable functions ([, sapply), non-standard evaluation, hidden arguments
stop("error msg", call. = FALSE)
view global options: options(), getOption("digits")
purrr - map(), map_dbl(), map_int(), etc. apply the same function to each element of a list. Similar to the apply family - sapply(), lapply()
map2(.x, .y, .f, ...), pmap(.l, .f, ...) - multiple arguments
invoke_map(.f, .x = list(NULL), ...) - multiple functions (deprecated since purrr 1.0.0)
walk() - similar to map() but for side effects
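Several of the points above in one sketch: the three components of a function, an early return(), stop() with call. = FALSE, passing a function as an argument, and the type instability of sapply(). The helper positive_mean() is hypothetical, and base R's vapply() stands in here as a type-stable alternative so the example runs without purrr.

```r
# Hypothetical helper: mean of the positive values in x.
positive_mean <- function(x, na.rm = FALSE) {
  if (!is.numeric(x)) {
    stop("`x` must be numeric", call. = FALSE)  # error message omits the call
  }
  if (length(x) == 0) {
    return(NA_real_)             # early return for a special case
  }
  mean(x[x > 0], na.rm = na.rm)  # last expression is the return value
}

# The three components of a function.
formals(positive_mean)           # arguments
body(positive_mean)              # body
environment(positive_mean)       # environment

# Functions are objects, so they can be passed to other functions.
inputs  <- list(c(1, -2, 3), c(4, 6), numeric(0))
results <- vapply(inputs, positive_mean, numeric(1))   # 2 5 NA

# sapply() is type-unstable: the return type depends on the input.
class(sapply(list(), length))               # "list"
class(sapply(list(1:3), length))            # "integer"
# vapply() always returns the declared type.
class(vapply(list(), length, integer(1)))   # "integer"
```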
 

Basic functions

class()
str()
rbind(), cbind()
colSums(), rowSums(), nrow(), ncol()
data.frame(), subset()
order(), arrange() (dplyr), rank()
file.path()
builds a file path from its components
seq_along()
generates an index sequence for a for loop; handles empty input correctly
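A minimal sketch of the base functions above on a small, made-up data frame. dplyr's arrange() is left out so the example needs no extra packages.

```r
df <- data.frame(name = c("b", "a", "c"), score = c(10, 30, 20))

class(df)                 # "data.frame"
str(df)                   # compact display of the structure
nrow(df); ncol(df)        # 3 rows, 2 columns

# order() returns the row permutation that sorts a column.
sorted <- df[order(df$score), ]
# rank() returns each value's position in the sorted order.
ranks <- rank(df$score)                   # 1 3 2

# rbind() adds rows, cbind() adds columns.
more <- rbind(df, data.frame(name = "d", score = 40))
wide <- cbind(df, passed = df$score >= 20)

# file.path() joins path components with the file separator.
p <- file.path("data", "raw", "foo.csv")  # "data/raw/foo.csv"

# seq_along() is safe for empty input; 1:length() is not.
empty <- c()
seq_along(empty)          # integer(0): a loop body runs zero times
1:length(empty)           # 1 0: a loop body would run twice
```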

Subsetting

indexing starts at 1
[2:5]
same as [c(2, 3, 4, 5)]
df[ ,1]
1st column
df[1, ]
1st row
df[2,3]
2nd row & 3rd column
df$x
selects the column (element) named x
ls[x] vs. ls[[x]] or ls$x
sublist vs. single element
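The subsetting forms above, applied to illustrative objects (the names x, df and lst are assumptions for this sketch):

```r
x <- c(10, 20, 30, 40, 50)
x[1]                         # 10: indexing starts at 1
identical(x[2:5], x[c(2, 3, 4, 5)])   # TRUE

df <- data.frame(x = 1:3, y = c("a", "b", "c"), z = c(TRUE, FALSE, TRUE))
df[, 1]                      # first column, as a vector
df[1, ]                      # first row, as a one-row data frame
df[2, 3]                     # row 2, column 3: FALSE
df$x                         # column named x

lst <- list(a = 1:3, b = "text")
lst["a"]                     # a list of length 1 (sublist)
lst[["a"]]                   # the element itself: 1 2 3
lst$a                        # same as lst[["a"]]
```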

IF Syntax

if (condition1) {
  ex1
} else if (condition2) {
  ex2
} else {
  ex3
}
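A concrete instance of the template above. The function sign_label() is a hypothetical helper used only for illustration.

```r
# Hypothetical helper that classifies a number by its sign.
sign_label <- function(n) {
  if (n > 0) {
    "positive"
  } else if (n < 0) {
    "negative"
  } else {
    "zero"
  }
}

labels <- c(sign_label(5), sign_label(-2), sign_label(0))
labels   # "positive" "negative" "zero"
```

if() expects a single TRUE or FALSE; for element-wise tests on a vector, ifelse() is the vectorised counterpart.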

FOR syntax

for (i in list) {
  print(i)
}

for (i in seq_along(list)) {
  print(list[[i]])
}

break & next

break - abandons the active loop: the remaining code in the loop is skipped and no further iterations run.
next - skips the remainder of the current iteration and continues with the next one.
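Both statements in one loop, as a minimal sketch:

```r
seen <- c()
for (i in 1:10) {
  if (i %% 2 == 0) {
    next        # skip even numbers, continue with the next i
  }
  if (i > 7) {
    break       # leave the loop entirely
  }
  seen <- c(seen, i)
}
seen            # 1 3 5 7
```

The loop stops at i = 9, the first odd number greater than 7, so 9 and 10 are never appended.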
 

Importing data

read.csv(), read.delim()
read.csv2(), read.delim2() - locale differences in decimal convention (comma as decimal mark)
read.table("foo.txt", header = TRUE, sep = "/", stringsAsFactors = FALSE)
readxl - read_excel("foo.xlsx", sheet = bar); lapply(excel_sheets("foo.xlsx"), read_excel, path = "foo.xlsx")
XLConnect - loadWorkbook(), getSheets(), readWorksheet() and a lot more
read SQL - DBI: dbListTables(), dbReadTable(), dbGetQuery() (efficient!)
read JSON - jsonlite: fromJSON()
read statistical software files (SAS, Stata, SPSS) - haven, foreign
packages: readr - similar to utils but simpler; data.table::fread() - fast, convenient
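A self-contained sketch of the base (utils) readers. It writes two small temporary files first, so the file contents and column names are made up for illustration; the package-based readers above are not shown because they need extra installs.

```r
# Write a small CSV to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,ann,3.5", "2,bob,4.0"), path)

# read.csv() assumes a header row, "," as separator and "." as decimal mark.
df <- read.csv(path, stringsAsFactors = FALSE)

# read.table() is the general form; the same file needs the options spelled out.
df2 <- read.table(path, header = TRUE, sep = ",", stringsAsFactors = FALSE)

# read.csv2() expects ";" as separator and "," as decimal mark.
path2 <- tempfile(fileext = ".csv")
writeLines(c("id;score", "1;3,5", "2;4,0"), path2)
df3 <- read.csv2(path2)

str(df)
unlink(c(path, path2))   # clean up the temporary files
```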