## 2. The %% operator is the modulo function.## i %% 2 attempts to divide i by 2, with ## whole numbers only and returns the remainder. for(i in1:10){if((i %%2) ==0) {print(i) }}
[1] 2
[1] 4
[1] 6
[1] 8
[1] 10
Loops for statistical analysis
library(palmerpenguins)
Write a loop that calculates and prints out the mean for each numeric variable in the penguins dataset
for(i in1:ncol(penguins)) {## cat prints to the console without quotescat(names(penguins)[i], " - ", sep ="") vari <- penguins[[i]] ## this iterates through the columns of penguinsif(is.numeric(vari)) { ## check whether it is numericcat("Mean:", mean(vari, na.rm =TRUE)) } else {cat("not numeric") }cat("\n") # new line}
species - not numeric
island - not numeric
bill_length_mm - Mean: 43.92193
bill_depth_mm - Mean: 17.15117
flipper_length_mm - Mean: 200.9152
body_mass_g - Mean: 4201.754
sex - not numeric
year - Mean: 2008.029
Modify your loop in 1 so that it prints out the mean, standard deviation, median, and interquartile range for each numeric variable in penguins.
f.names <-c("mean", "std dev", "median", "interquartile range")functions <-list(mean, sd, median, IQR)## we will add an inner loop that iterates through these function and namesfor(i in1:ncol(penguins)) {cat(names(penguins)[i], ":\n", sep ="") vari <- penguins[[i]]if(is.numeric(vari)) { for(fn in1:length(f.names)) {cat(f.names[fn], functions[[fn]](vari, na.rm =TRUE), "\n") } } else {cat("not numeric\n") }cat("--------\n") # new line}
species:
not numeric
--------
island:
not numeric
--------
bill_length_mm:
mean 43.92193
std dev 5.459584
median 44.45
interquartile range 9.275
--------
bill_depth_mm:
mean 17.15117
std dev 1.974793
median 17.3
interquartile range 3.1
--------
flipper_length_mm:
mean 200.9152
std dev 14.06171
median 197
interquartile range 23
--------
body_mass_g:
mean 4201.754
std dev 801.9545
median 4050
interquartile range 1200
--------
sex:
not numeric
--------
year:
mean 2008.029
std dev 0.8183559
median 2008
interquartile range 2
--------
Write a loop to compute 500 bootstrap replicates of the means of bill length, bill depth, and flipper length. Remember to pre-allocate a data structure to store the 500 times 3 values. Provide an estimate of the correlation of the sample means.
bill_length bill_depth flipper_length
Min. :43.11 Min. :16.78 Min. :198.1
1st Qu.:43.72 1st Qu.:17.07 1st Qu.:200.4
Median :43.92 Median :17.14 Median :200.9
Mean :43.92 Mean :17.15 Mean :200.9
3rd Qu.:44.12 3rd Qu.:17.22 3rd Qu.:201.4
Max. :44.88 Max. :17.49 Max. :203.1
Loops are sometimes unavoidable if a calculation depends on the value at one or more of the previous iterations.
One way to compute the Kaplan-Meier curve for right censored data is to loop through the death times and accumulate the product of 1 minus the number of deaths at each time over the number at risk at that time. Complete the following code to compute the KM curve and compare to the result from the survival package.
Write a loop that contains an if then else statement that goes through the variables in penguins and replaces missing values with the mean for numeric double variables, and the most frequent value for characters or factors.
species island bill_length_mm bill_depth_mm
Adelie :152 Biscoe :168 Min. :32.10 Min. :13.10
Chinstrap: 68 Dream :124 1st Qu.:39.27 1st Qu.:15.60
Gentoo :124 Torgersen: 52 Median :44.25 Median :17.30
Mean :43.92 Mean :17.15
3rd Qu.:48.50 3rd Qu.:18.70
Max. :59.60 Max. :21.50
flipper_length_mm body_mass_g sex year
Min. :172.0 Min. :2700 female:165 Min. :2007
1st Qu.:190.0 1st Qu.:3550 male :179 1st Qu.:2007
Median :197.0 Median :4025 Median :2008
Mean :200.9 Mean :4199 Mean :2008
3rd Qu.:213.0 3rd Qu.:4750 3rd Qu.:2009
Max. :231.0 Max. :6300 Max. :2009