Day 2, A
Normally code is run line by line, in order, from top to bottom.
There are special commands that allow you to change that.
Conditional execution: choose which code to run depending on logical conditions
if(<condition>)
else if
else
Loops: repeat a chunk of code several times
repeat
for
while
If the <condition> evaluates to TRUE, then something gets executed, otherwise something else gets executed.
<condition> must be length one. No vectors allowed.
Brackets ({ }) around each branch are helpful for understanding and clarity.
The expression does not need to return anything.
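The points above can be illustrated with a minimal sketch. The values of x and temps are hypothetical and chosen only for illustration; any() is used to collapse a vector condition to a single TRUE/FALSE before it reaches if.

```r
# Hypothetical value, chosen only for illustration
x <- 7

# Exactly one of the three branches is executed
if (x > 10) {
  size <- "large"
} else if (x > 5) {
  size <- "medium"
} else {
  size <- "small"
}
size

# A vector condition must be reduced to length one first, e.g. with any() or all()
temps <- c(-2, 4, 9)
if (any(temps < 0)) {
  frost <- TRUE
} else {
  frost <- FALSE
}
frost
```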
ifelse
This is a function, not a statement like if and else.
It is vectorized and hence better suited for working with data.
ifelse(<logical vector>, <yes vector>, <no vector>)
All three vectors should be the same length or recycling happens.
It returns a vector with elements from yes when TRUE, and elements from no when FALSE.
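A small sketch of ifelse in use. The body_mass values are made up for illustration. Here yes and no are single strings, so they are recycled to the length of the logical vector.

```r
# Hypothetical body masses in grams, made up for illustration
body_mass <- c(3200, 4100, 5600, NA, 3900)

# One test per element: TRUE picks from yes, FALSE picks from no.
# An NA in the test gives NA in the result.
size_class <- ifelse(body_mass > 4000, "heavy", "light")
size_class
```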
A loop evaluates an expression repeatedly, one iteration after another.
A loop will continue forever unless you tell it to stop. How you tell it to stop differs between the loop statements.
repeat is the simplest of loops: it repeats an expression until it encounters break.
This will run forever
This will run exactly once
This will run 5 times
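The three cases above can be sketched as follows. This is a reconstruction for illustration, not necessarily the original slide code. The infinite loop is left commented out so that the script terminates, and runs_once is a hypothetical counter name.

```r
# Runs forever, because there is no break:
# repeat {
#   message("still going")
# }

# Runs exactly once: break is reached during the first pass
runs_once <- 0
repeat {
  runs_once <- runs_once + 1
  break
}
runs_once

# Runs 5 times: i changes on every pass and break fires once i reaches 5
i <- 0
repeat {
  i <- i + 1
  if (i >= 5) break
}
i
```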
Notice the pattern: repeat an expression until a condition is met.
The condition usually depends on a variable that changes at each iteration, in this case i, the iterator.
while loops explicitly state the condition at the start.
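As a sketch, a loop that runs 5 times can be written with while like this. The condition is checked before every pass, so no break is needed.

```r
i <- 0
while (i < 5) {   # checked before each pass; the loop ends when this is FALSE
  i <- i + 1
}
i
```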
A for loop explicitly states the sequence of the iterator at the start. The “end condition” is then that the loop has reached the end of the sequence.
In this case, it is the most concise. I personally use for loops more than any other loop.
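A sketch of the same 5-pass loop written with for. The counter passes is a hypothetical name used only to show how many times the body ran.

```r
passes <- 0
for (i in 1:5) {          # i takes the values 1, 2, 3, 4, 5 in turn
  passes <- passes + 1
}
passes
```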
break can be used inside any loop to end it. next can be used to go to the next iteration.
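A minimal sketch of both statements inside one for loop. The numbers are arbitrary and only serve the illustration.

```r
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next   # even number: skip the rest of this pass
  if (i > 7) break        # first odd number above 7: leave the loop entirely
  total <- total + i      # reached only for i = 1, 3, 5, 7
}
total
```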
library(palmerpenguins)
species_names <- levels(penguins$species)
mean_bill_length <- numeric(length(species_names))
names(mean_bill_length) <- species_names
for(i in species_names){ ## iterator is a character
  mean_bill_length[i] <- ## indexing by name
    mean(subset(penguins, species == i)$bill_length_mm, na.rm = TRUE)
}
mean_bill_length
   Adelie Chinstrap    Gentoo
 38.79139  48.83382  47.50488
mu.body_mass <- mean(penguins$body_mass_g, na.rm = TRUE)
bsmeans <- vector("numeric", length = 2000)
for(i in seq_along(bsmeans)) {
  resampled.body_mass <- sample(penguins$body_mass_g, replace = TRUE)
  bsmeans[i] <- mean(resampled.body_mass, na.rm = TRUE)
}
hist(bsmeans)
abline(v = mu.body_mass, col = "red")
A loop can itself contain a loop, or multiple loops.
species_names <- levels(penguins$species)
island_names <- levels(penguins$island)
mean_bm_matrix <- matrix(NA, nrow = length(species_names),
                         ncol = length(island_names),
                         dimnames = list(species_names, island_names))
for(i in species_names) {
  for(j in island_names) {
    thisset <- subset(penguins, species == i &
                        island == j)
    if(nrow(thisset) == 0) next
    mean_bm_matrix[i, j] <- mean(thisset$body_mass_g, na.rm = TRUE)
  }
}
mean_bm_matrix
            Biscoe    Dream Torgersen
Adelie    3709.659 3688.393  3706.373
Chinstrap       NA 3733.088        NA
Gentoo    5076.016       NA        NA
You may see people warn you not to use for loops in R, “because they are slow”.
That is partially true, but speed is not the only consideration: for loops can be much clearer and more understandable than the alternatives.
The slow part in R is changing the size of an object. You can avoid that by creating a vector/matrix/array of the correct size to hold the results before the loop starts.
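A sketch contrasting the two approaches. The names squares_grown and squares_prealloc and the size n are hypothetical. Both loops give the same result, but the first one changes the size of its result vector on every pass.

```r
n <- 1000

# Growing: the result vector is extended (and copied) on every pass
squares_grown <- numeric(0)
for (i in seq_len(n)) {
  squares_grown <- c(squares_grown, i^2)
}

# Preallocated: created once at full size, then filled in place
squares_prealloc <- numeric(n)
for (i in seq_len(n)) {
  squares_prealloc[i] <- i^2
}

identical(squares_grown, squares_prealloc)
```

The penguin examples above already follow the preallocated pattern, with numeric(length(species_names)) and matrix(NA, ...) created before the loop.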