Introduction to R Cheat Sheet (DRAFT) by

Intro to basics, Vectors, Matrices, Factors, Data Frames & Lists

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Arithmetic operators

Addition
+
Subtraction
-
Multiplication
*
Division
/
Modulo
%%
Exponentiation
^
Modulo returns the remainder after dividing the number on its left by the number on its right. For example, 5 modulo 3, written 5 %% 3, is 2.
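
A minimal sketch of the operators above applied to one pair of numbers. The variable names a, b and results are illustrative only.

```r
# Apply each arithmetic operator from the table to the same two numbers
a <- 5
b <- 3

results <- c(
  sum        = a + b,   # 8
  difference = a - b,   # 2
  product    = a * b,   # 15
  quotient   = a / b,   # 1.666667
  remainder  = a %% b,  # 2
  power      = a ^ b    # 125
)
print(results)

# In R the result of %% takes the sign of the right-hand number
negative_remainder <- -5 %% 3   # 1
```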

Comparison operators

Less than
<
More than
>
Less than or equal to
<=
Greater than or equal to
>=
Equal to each other
==
Not equal to each other
!=

Selecting by comparison

# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Which days did you make money on roulette?
selection_vector <- roulette_vector > 0

# Select from roulette_vector these days
roulette_winning_days <- roulette_vector[selection_vector]

Data Types

Decimal values
4.5
Numerics
Whole numbers
4L (a plain 4 is stored as numeric)
Integers
Boolean values
TRUE / FALSE
Logical
Text / String
"Text"
Characters
Show the data type: class(data)
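
A short sketch of class() on one value of each type. Note that R stores a bare whole number such as 4 as numeric; the L suffix is what makes it an integer.

```r
# class() reports the data type of a value
decimal_class <- class(4.5)     # "numeric"
whole_class   <- class(4)       # "numeric" (no L suffix)
integer_class <- class(4L)      # "integer"
logical_class <- class(TRUE)    # "logical"
text_class    <- class("Text")  # "character"

print(c(decimal_class, whole_class, integer_class, logical_class, text_class))
```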

Lists

Create a list
my_list <- list(element1, element2)
Give names to the list items
my_list <- list(name1 = your_comp1, name2 = your_comp2)
# Adapt list() call to give the components names
my_list <- list(vec = my_vector,
                mat = my_matrix,
                df = my_df)

# or, if the list was already created
names(my_list) <- c("vec", "mat", "df")
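
A runnable sketch of both naming approaches. The contents of my_vector, my_matrix and my_df are placeholders invented for illustration; any objects would do.

```r
# Placeholder components (assumed values, not from the cheat sheet)
my_vector <- 1:10
my_matrix <- matrix(1:9, ncol = 3)
my_df     <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Option 1: name the components while creating the list
named_at_creation <- list(vec = my_vector, mat = my_matrix, df = my_df)

# Option 2: create the list first, then assign the names
my_list <- list(my_vector, my_matrix, my_df)
names(my_list) <- c("vec", "mat", "df")

# Both lists now carry the same component names
print(names(my_list))
```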

Selecting components in a list

One way to select a component is by its numbered position. For example, to "grab" the first component of shining_list you type

shining_list[[1]]

A quick way to check this out is typing it in the console. Important to remember: components of a list are selected with double square brackets [[ ]], while elements of a vector are selected with single square brackets [ ]. Don't mix them up!

You can also refer to the names of the components, with [[ ]] or with the $ sign. Both will select the data frame representing the reviews:

shining_list[["reviews"]]
shining_list$reviews

Besides selecting components, you often need to select specific elements out of these components. For example, with shining_list[[2]][1] you select from the second component, actors (shining_list[[2]]), the first element ([1]). When you type this in the console, you will see the answer is Jack Nicholson.
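
The selections above can be tried on a stand-in list. This shining_list is a hypothetical reconstruction: only the actors component (second position, Jack Nicholson first) and the reviews data frame are mentioned in the text; the first component and all review values are assumptions.

```r
# Hypothetical stand-in for shining_list (review values are made up)
actors  <- c("Jack Nicholson", "Shelley Duvall")
reviews <- data.frame(score = c(4.5, 4.0),
                      comment = c("Great", "Good"))
shining_list <- list(moviename = "The Shining",
                     actors = actors,
                     reviews = reviews)

# Double brackets return the component itself
second_component <- shining_list[[2]]      # the actors vector
first_actor      <- shining_list[[2]][1]   # "Jack Nicholson"

# Name-based selection gives the same component
same_reviews <- identical(shining_list[["reviews"]], shining_list$reviews)

# Single brackets on a list return a (shorter) list, not the component
still_a_list <- shining_list["actors"]

print(first_actor)
```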
 

Vector Basics

Assign value to variable
my_var <- 4
Numeric vector
numeric_vector <- c(1, 10, 49)
Character vector
character_vector <- c("a", "b", "c")
Boolean vector
boolean_vector <- c(TRUE, FALSE, TRUE)
Naming a vector
names(numeric_vector) <- c("Jack", "Jill", "Johanna")
Sum of the elements in the vector
sum(vector_name)
Select element 3 of the vector
element <- vector_name[3]
Select elements 2, 3, 4, 5 of the vector
elements <- vector_name[2:5]
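
A small sketch that strings these operations together on the numeric vector from the table. Selecting by name is not listed above; it is included as an assumption about what is useful once a vector is named.

```r
# Build and name the vector
numeric_vector <- c(1, 10, 49)
names(numeric_vector) <- c("Jack", "Jill", "Johanna")

total     <- sum(numeric_vector)     # 60
third     <- numeric_vector[3]       # element 3 (Johanna)
first_two <- numeric_vector[1:2]     # elements 1 and 2
by_name   <- numeric_vector["Jill"]  # selection by name also works

print(total)
print(first_two)
```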

Factors

# Animals - Turn vector character elements into nominal factors
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)
factor_animals_vector

# Temperature - Turn vector character elements into ordinal factors
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature_vector <- factor(temperature_vector, ordered = TRUE, levels = c("Low", "Medium", "High"))
factor_temperature_vector

When factors are ordinal:
ordered = TRUE

To give the order of the ordinal factors:
levels = c("Low", "Medium", "High")
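
One thing an ordinal factor allows, sketched here with the temperature example, is comparing elements with < and >. The variable names first_is_hotter and level_counts are illustrative.

```r
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature_vector <- factor(temperature_vector,
                                    ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))

# Levels are stored in the order given, not alphabetically
print(levels(factor_temperature_vector))

# Comparison is meaningful only because the factor is ordered
first_is_hotter <- factor_temperature_vector[1] > factor_temperature_vector[2]

# Count how often each level occurs
level_counts <- summary(factor_temperature_vector)
print(level_counts)
```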

Data Frames

Show the first rows (6 by default)
head(data_frame)
Show the last rows (6 by default)
tail(data_frame)
Summarize data frame (min, max, median, quartiles)
summary(data_frame)
Structure (number of observations and variables, names, data types...)
str(data_frame)
Unlike matrices, a data frame can hold columns of different data types, BUT all columns must have the same length (components of a list need not).
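
A quick sketch of the difference between a matrix and a data frame when types are mixed. The object names are made up; stringsAsFactors = FALSE is passed explicitly so the result does not depend on the R version.

```r
# A matrix coerces everything to one type (here: character)
mixed_matrix <- cbind(c(1, 2), c("a", "b"))
print(class(mixed_matrix[1, 1]))

# A data frame keeps one type per column
mixed_df <- data.frame(number = c(1, 2),
                       letter = c("a", "b"),
                       stringsAsFactors = FALSE)
str(mixed_df)
```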

Create data frame from vectors + select values

# Definition of vectors
name <- c("Mercury", "Venus", "Earth", 
          "Mars", "Jupiter", "Saturn", 
          "Uranus", "Neptune")
type <- c("Terrestrial planet", 
          "Terrestrial planet", 
          "Terrestrial planet", 
          "Terrestrial planet", "Gas giant", 
          "Gas giant", "Gas giant", "Gas giant")
diameter <- c(0.382, 0.949, 1, 0.532, 
              11.209, 9.449, 4.007, 3.883)
rotation <- c(58.64, -243.02, 1, 1.03, 
              0.41, 0.43, -0.72, 0.67)
rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)

# Create a data frame from the vectors
planets_df <- data.frame(name, type, diameter, rotation, rings)

# Select first 5 values of diameter column
planets_df[1:5, "diameter"]

# Select the rings variable from planets_df
rings_vector <- planets_df$rings

# Select planets with diameter < 1
subset(planets_df, subset = diameter < 1)

Order the data

In data analysis you can sort your data according to a certain variable in the dataset. In R, this is done with the help of the function order().

order() is a function that returns the positions (indices) of the elements of a variable, such as a vector, arranged so that the elements come out sorted in ascending order:

a <- c(100, 10, 1000)
order(a)
[1] 2 1 3

10, which is the second element in a, is the smallest element, so 2 comes first in the output of order(a). 100, which is the first element in a, is the second smallest element, so 1 comes second. 1000, the third element, is the largest, so 3 comes last.

This means we can use the output of order(a) to reshuffle a:

a[order(a)]
[1] 10 100 1000
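
The same idea can be used to sort the rows of a data frame. This is a sketch on a cut-down, four-planet version of the planets data; the names positions, sorted_df and largest_first are illustrative.

```r
# Four planets only, to keep the example short
planets_df <- data.frame(name = c("Mercury", "Venus", "Earth", "Mars"),
                         diameter = c(0.382, 0.949, 1, 0.532))

# Row positions that sort diameter from small to large
positions <- order(planets_df$diameter)

# Use the positions as row indices to reorder the whole data frame
sorted_df <- planets_df[positions, ]

# decreasing = TRUE sorts from large to small instead
largest_first <- planets_df[order(planets_df$diameter, decreasing = TRUE), ]

print(sorted_df)
```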
 

Matrices

Construct a matrix with 3 rows that contains the numbers 1 to 9
matrix(1:9, byrow = TRUE, nrow = 3)
From vector to matrix
my_matrix <- matrix(vector_name, byrow = TRUE, nrow = 3)
Totals for each row of a matrix
rowSums(my_matrix)
Totals for each column of a matrix
colSums(my_matrix)
Adding columns
big_matrix <- cbind(vector1, matrix1)
Adding rows
big_matrix <- rbind(matrix1, matrix2)
Select all elements of the first column
my_matrix[, 1]
Select all elements of the first row
my_matrix[1, ]
Select 2nd element of 3rd column
my_matrix[2, 3]
Create matrix with the data on the rows 1, 2, 3 and columns 2, 3, 4
my_matrix[1:3, 2:4]
Average of the matrix elements
mean(my_matrix)
Summary of matrix (and other objects)
summary(my_matrix)
The argument byrow = TRUE indicates that the matrix is filled row by row. To fill the matrix column by column, set byrow = FALSE (the default).

All data in a matrix must be of the same type. Otherwise, create a data frame.
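
A small sketch of how byrow changes the fill order, using the 1 to 9 matrix from the table. The names by_row and by_col are illustrative.

```r
# Same numbers, two fill orders
by_row <- matrix(1:9, byrow = TRUE,  nrow = 3)  # rows: 1 2 3 / 4 5 6 / 7 8 9
by_col <- matrix(1:9, byrow = FALSE, nrow = 3)  # columns: 1 2 3 / 4 5 6 / 7 8 9

print(by_row)
print(by_col)

first_row_by_row <- by_row[1, ]   # 1 2 3
first_row_by_col <- by_col[1, ]   # 1 4 7

row_totals <- rowSums(by_row)     # 6 15 24
col_totals <- colSums(by_row)     # 12 15 18
```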

Naming a Matrix

# Box office Star Wars (in millions!)
new_hope <- c(460.998, 314.4)
empire_strikes <- c(290.475, 247.900)
return_jedi <- c(309.306, 165.8)

# Construct matrix
star_wars_matrix <- matrix(c(new_hope, empire_strikes, return_jedi), nrow = 3, byrow = TRUE)

# Vectors region and titles, used for naming
region <- c("US", "non-US")
titles <- c("A New Hope", "The Empire Strikes Back", "Return of the Jedi")

# Name the columns with region
colnames(star_wars_matrix) <- region

# Name the rows with titles
rownames(star_wars_matrix) <- titles

# Print out star_wars_matrix
star_wars_matrix
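
Once rows and columns are named, the names can be used for selection as well. A sketch on the same box office figures; passing dimnames directly to matrix() is an alternative to rownames() and colnames(), and worldwide is an illustrative name.

```r
# Same data as above, with names supplied through the dimnames argument
star_wars_matrix <- matrix(c(460.998, 314.4,
                             290.475, 247.900,
                             309.306, 165.8),
                           nrow = 3, byrow = TRUE,
                           dimnames = list(c("A New Hope",
                                             "The Empire Strikes Back",
                                             "Return of the Jedi"),
                                           c("US", "non-US")))

# Select a single cell by row and column name
new_hope_us <- star_wars_matrix["A New Hope", "US"]

# Worldwide total per film (US + non-US); row names carry over
worldwide <- rowSums(star_wars_matrix)
print(worldwide)
```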