
# Introduction to R Cheat Sheet (DRAFT) by josi68

Intro to basics, Vectors, Matrices, Factors, Data Frames & Lists

This is a draft cheat sheet. It is a work in progress and is not finished yet.

### Arithmetic

| Operation | Operator |
| --- | --- |
| Addition | `+` |
| Subtraction | `-` |
| Multiplication | `*` |
| Division | `/` |
| Modulo | `%%` |
| Exponentiation | `^` |

Modulo returns the remainder of dividing the number on the left by the number on the right. For example, 5 modulo 3, written `5 %% 3`, is 2.
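As a quick check of the operators above, here is a minimal sketch that can be pasted into the R console. The variable names are illustrative only and the numbers are arbitrary.

```r
# Each result is stored so it can be inspected afterwards
added     <- 5 + 3    # 8
reduced   <- 5 - 3    # 2
product   <- 5 * 3    # 15
quotient  <- 5 / 3    # roughly 1.666667
remainder <- 5 %% 3   # 2, the remainder of 5 divided by 3
power     <- 5 ^ 3    # 125

# With a negative left-hand side, the result of %% takes the sign of the divisor
negative_remainder <- -5 %% 3   # 1
```

Note that `-5 %% 3` parses as `(-5) %% 3`, because `%%` binds more tightly than unary minus in R.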

### Comparison operators

| Comparison | Operator |
| --- | --- |
| Less than | `<` |
| Greater than | `>` |
| Less than or equal to | `<=` |
| Greater than or equal to | `>=` |
| Equal to each other | `==` |
| Not equal to each other | `!=` |

### Selecting by comparison

```r
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Which days did you make money on roulette?
selection_vector <- roulette_vector > 0

# Select from roulette_vector these days
roulette_winning_days <- roulette_vector[selection_vector]
```

### Data Types

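| Kind of value | Example | Data type |
| --- | --- | --- |
| Decimal values | `4.5` | Numeric |
| Whole numbers | `4L` | Integer (a plain `4` is stored as numeric) |
| Boolean values | `TRUE` / `FALSE` | Logical |
| Text / string | `"Text"` | Character |

Show the data type: `class(data)`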
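To see what `class()` reports for each kind of value, a small sketch follows. The variable names are made up for illustration.

```r
# class() returns the data type as a character string
class_decimal <- class(4.5)      # "numeric"
class_whole   <- class(4)        # "numeric": a bare 4 is not an integer
class_integer <- class(4L)       # "integer": the L suffix requests an integer
class_logical <- class(TRUE)     # "logical"
class_text    <- class("Text")   # "character"
```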

### Lists

| Task | Code |
| --- | --- |
| Create a list | `my_list <- list(element1, element2)` |
| Give names to the list items | `my_list <- list(name1 = your_comp1, name2 = your_comp2)` |

```r
# Adapt the list() call to give the components names
my_list <- list(vec = my_vector,
                mat = my_matrix,
                df = my_df)

# Or, if the list was already created
names(my_list) <- c("vec", "mat", "df")
```
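The two naming approaches can be tried end to end with the sketch below. The contents of `my_vector`, `my_matrix` and `my_df` are assumptions chosen only so the example runs; any vector, matrix and data frame would do.

```r
# Assumed example components (not from the cheat sheet)
my_vector <- 1:10
my_matrix <- matrix(1:9, ncol = 3)
my_df     <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))

# Approach 1: name the components while creating the list
named_at_creation <- list(vec = my_vector, mat = my_matrix, df = my_df)

# Approach 2: create the list first, then assign the names
named_afterwards <- list(my_vector, my_matrix, my_df)
names(named_afterwards) <- c("vec", "mat", "df")

# Both lists end up identical
same_result <- identical(named_at_creation, named_afterwards)
```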

### Selecting components in a list

One way to select a component is by its numbered position. For example, to "grab" the first component of `shining_list`, type `shining_list[[1]]`. A quick way to check this is to type it in the console.

Important to remember: components of a list are selected with double square brackets, `[[ ]]`, whereas elements of a vector are selected with single square brackets, `[ ]`. Don't mix them up!

You can also refer to components by name, with `[[ ]]` or with the `$` sign. Both of the following select the data frame representing the reviews: `shining_list[["reviews"]]` and `shining_list$reviews`.

Besides selecting components, you often need to select specific elements out of these components. For example, `shining_list[[2]][1]` selects from the second component, actors (`shining_list[[2]]`), the first element (`[1]`). When you type this in the console, you will see the answer is Jack Nicholson.
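The selections described above can be tried on a stand-in list. The cheat sheet does not show how `shining_list` is built, so the construction below is an assumption: only the position of the actors component, the name `reviews`, and the first actor come from the text, and everything else (the name `moviename`, the second actor, the review data) is hypothetical.

```r
# Hypothetical stand-in for shining_list
shining_list <- list(
  moviename = "The Shining",
  actors    = c("Jack Nicholson", "Shelley Duvall"),
  reviews   = data.frame(score = c(4.5, 4.0),
                         comment = c("Best horror film", "A classic"))
)

first_component <- shining_list[[1]]           # select by position
reviews_by_name <- shining_list[["reviews"]]   # select by name with [[ ]]
reviews_dollar  <- shining_list$reviews        # select by name with $
first_actor     <- shining_list[[2]][1]        # first element of the second component

# Single brackets on a list return a (shorter) list, not the component itself
still_a_list <- shining_list[1]
```

Comparing `class(shining_list[[1]])` with `class(shining_list[1])` in the console shows the difference between the two bracket styles.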

### Vector Basics

| Task | Code |
| --- | --- |
| Assign a value to a variable | `my_var <- 4` |
| Numeric vector | `numeric_vector <- c(1, 10, 49)` |
| Character vector | `character_vector <- c("a", "b", "c")` |
| Boolean vector | `boolean_vector <- c(TRUE, FALSE, TRUE)` |
| Name the elements of a vector | `names(numeric_vector) <- c("Jack", "Jill", "Johanna")` |
| Sum of the elements in the vector | `sum(vector_name)` |
| Select element 3 of the vector | `element <- vector_name[3]` |
| Select elements 2, 3, 4, 5 of the vector | `elements <- vector_name[2:5]` |
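A short sketch that strings these vector operations together, using the numeric vector and names from the table. Selecting by name is not listed in the table and is added here only as a related illustration.

```r
numeric_vector <- c(1, 10, 49)
names(numeric_vector) <- c("Jack", "Jill", "Johanna")

total       <- sum(numeric_vector)       # 60
third       <- numeric_vector[3]         # element 3, still carrying its name "Johanna"
first_two   <- numeric_vector[1:2]       # a range of positions
jill_value  <- numeric_vector["Jill"]    # selection by name (extra illustration)
first_third <- numeric_vector[c(1, 3)]   # non-adjacent positions
```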

### Factors

```r
# Animals - turn character vector elements into nominal factors
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)
factor_animals_vector

# Temperature - turn character vector elements into ordinal factors
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature_vector <- factor(temperature_vector,
                                    ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))
factor_temperature_vector
```
When factors are ordinal:
ordered = TRUE
(the shortened form order = TRUE also works, because R matches argument names partially)

To give the order of the ordinal factors:
levels = c("Low", "Medium", "High")
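One reason to declare a factor as ordinal is that its values can then be compared. The sketch below reuses the temperature vector from this section; the result variable names are illustrative.

```r
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature_vector <- factor(temperature_vector,
                                    ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))

level_names  <- levels(factor_temperature_vector)    # "Low" "Medium" "High"
level_counts <- summary(factor_temperature_vector)   # number of observations per level

# Comparison is only meaningful for ordered factors: is day 1 hotter than day 2?
first_is_hotter <- factor_temperature_vector[1] > factor_temperature_vector[2]
```

With an unordered factor, the same comparison returns `NA` with a warning.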

### Data Frames

| Task | Code |
| --- | --- |
| Show the first few rows | `head(data_frame)` |
| Show the last few rows | `tail(data_frame)` |
| Summarize the data frame (min, max, median, quartiles) | `summary(data_frame)` |
| Structure (number of observations and variables, names, data types, ...) | `str(data_frame)` |

Unlike matrices, a data frame can hold different types of data, BUT all variables (columns) need to have the same length (unlike the components of a list).
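To try the inspection functions, a small made-up data frame is enough. The data frame `scores_df` and its contents are invented for this sketch.

```r
# Invented example data: three columns of different types, all of length 8
scores_df <- data.frame(
  student = c("Ann", "Ben", "Cai", "Dee", "Eli", "Fay", "Gus", "Hal"),
  score   = c(71, 85, 90, 64, 78, 88, 59, 93),
  passed  = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
)

first_rows <- head(scores_df)      # first 6 rows by default
last_rows  <- tail(scores_df, 2)   # second argument sets the number of rows
overview   <- summary(scores_df)   # per-column summary statistics
str(scores_df)                     # prints dimensions, column names and types
```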

### Create data frame from vectors + select values

```r
# Definition of vectors
name <- c("Mercury", "Venus", "Earth",
          "Mars", "Jupiter", "Saturn",
          "Uranus", "Neptune")
type <- c("Terrestrial planet",
          "Terrestrial planet",
          "Terrestrial planet",
          "Terrestrial planet", "Gas giant",
          "Gas giant", "Gas giant", "Gas giant")
diameter <- c(0.382, 0.949, 1, 0.532,
              11.209, 9.449, 4.007, 3.883)
rotation <- c(58.64, -243.02, 1, 1.03,
              0.41, 0.43, -0.72, 0.67)
rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)

# Create a data frame from the vectors
planets_df <- data.frame(name, type, diameter, rotation, rings)

# Select first 5 values of diameter column
planets_df[1:5, "diameter"]

# Select the rings variable from planets_df
rings_vector <- planets_df$rings

# Select planets with diameter < 1
subset(planets_df, subset = diameter < 1)
```

### Order the data

In data analysis you often sort your data according to a certain variable in the dataset. In R, this is done with the help of the function `order()`. Applied to a variable such as a vector, `order()` returns the positions of the elements, arranged so that the values they point to are in ascending order. For example, with `a <- c(100, 10, 1000)`, the call `order(a)` prints `[1] 2 1 3`.

10, which is the second element in `a`, is the smallest element, so 2 comes first in the output of `order(a)`. 100, which is the first element in `a`, is the second smallest element, so 1 comes second in the output of `order(a)`.

This means we can use the output of `order(a)` to reshuffle `a`: `a[order(a)]` prints `[1] 10 100 1000`.
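The same idea extends from a vector to the rows of a data frame. The sketch below assumes a cut-down version of the planets data (four planets, two columns) so that it runs on its own.

```r
# Cut-down planets data, assumed here for a self-contained example
planets_df <- data.frame(
  name     = c("Mercury", "Venus", "Earth", "Mars"),
  diameter = c(0.382, 0.949, 1, 0.532)
)

# Row positions from smallest to largest diameter
positions <- order(planets_df$diameter)

# Use the positions as row indices to sort the whole data frame
sorted_df <- planets_df[positions, ]

# decreasing = TRUE reverses the direction
largest_first <- planets_df[order(planets_df$diameter, decreasing = TRUE), ]
```

The comma in `planets_df[positions, ]` matters: the positions select rows, and the empty slot after the comma keeps all columns.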

### Matrices

| Task | Code |
| --- | --- |
| Construct a matrix with 3 rows containing the numbers 1 to 9 | `matrix(1:9, byrow = TRUE, nrow = 3)` |
| From vector to matrix | `matrix_name <- matrix(vector_name, byrow = TRUE, nrow = 3)` |
| Totals for each row of a matrix | `rowSums(my_matrix)` |
| Totals for each column of a matrix | `colSums(my_matrix)` |
| Add columns | `big_matrix <- cbind(vector1, matrix1)` |
| Add rows | `rbind()` |
| Select all elements of the first column | `matrix[, 1]` |
| Select all elements of the first row | `matrix[1, ]` |
| Select the element in row 2, column 3 | `matrix[2, 3]` |
| Select the data in rows 1, 2, 3 and columns 2, 3, 4 | `matrix[1:3, 2:4]` |
| Average of the matrix elements | `mean(matrix_name)` |
| Summary of a matrix (and other objects) | `summary(matrix_name)` |

The argument `byrow` indicates that the matrix is filled by rows. To fill the matrix by columns instead, set `byrow = FALSE` (the default).

All data in a matrix should be of the same type. Otherwise, create a data frame.
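A worked sketch of the matrix operations above on the 1-to-9 matrix. The extra vector `c(10, 11, 12)` and the result names are illustrative choices.

```r
# Filled row by row:
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9
my_matrix <- matrix(1:9, byrow = TRUE, nrow = 3)

# Without byrow = TRUE the matrix is filled column by column
by_column <- matrix(1:9, nrow = 3)

row_totals <- rowSums(my_matrix)   # 6 15 24
col_totals <- colSums(my_matrix)   # 12 15 18

wider  <- cbind(my_matrix, c(10, 11, 12))   # adds a column: 3 x 4
taller <- rbind(my_matrix, c(10, 11, 12))   # adds a row: 4 x 3

element      <- my_matrix[2, 3]    # row 2, column 3: 6
overall_mean <- mean(my_matrix)    # 5
```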

### Naming a Matrix

```r
# Box office Star Wars (in millions!)
new_hope <- c(460.998, 314.4)
empire_strikes <- c(290.475, 247.900)
return_jedi <- c(309.306, 165.8)

# Construct matrix
star_wars_matrix <- matrix(c(new_hope, empire_strikes, return_jedi),
                           nrow = 3, byrow = TRUE)

# Vectors region and titles, used for naming
region <- c("US", "non-US")
titles <- c("A New Hope", "The Empire Strikes Back", "Return of the Jedi")

# Name the columns with region
colnames(star_wars_matrix) <- region

# Name the rows with titles
rownames(star_wars_matrix) <- titles

# Print out star_wars_matrix
star_wars_matrix
```
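As an alternative to naming after construction, the names can be passed through the `dimnames` argument of `matrix()`. This is a sketch of that variant, not the approach used in the block above; `box_office` and the result names are introduced here for illustration.

```r
box_office <- c(460.998, 314.4, 290.475, 247.900, 309.306, 165.8)
region <- c("US", "non-US")
titles <- c("A New Hope", "The Empire Strikes Back", "Return of the Jedi")

# dimnames takes a list: row names first, then column names
star_wars_matrix <- matrix(box_office, nrow = 3, byrow = TRUE,
                           dimnames = list(titles, region))

# Once named, rows and columns can be selected by name
us_new_hope <- star_wars_matrix["A New Hope", "US"]

# Row totals keep the row names
worldwide <- rowSums(star_wars_matrix)
```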