Write a tidy method for the object returned by our mean_sd function and try it out on the penguins dataset.
library(palmerpenguins)
library(broom)

# Compute the mean and standard deviation of x and remember the variable name
mean_sd <- function(x, na.rm = TRUE) {
  xname <- deparse1(substitute(x)) # gets the name of the variable
  res <- c(mean = mean(x, na.rm = na.rm), sd = sd(x, na.rm = na.rm))
  class(res) <- "meansd"
  attr(res, "variable") <- xname
  attr(res, "sampsize") <- length(x) # counts all values, including missing ones
  res
}

# Print as "variable : mean (sd)"
print.meansd <- function(x, digits = 2, ...) {
  cat(attr(x, "variable"), ": ",
      paste0(round(x["mean"], digits = digits), " (",
             round(x["sd"], digits = digits), ")\n"))
  invisible(x)
}

# Return the summary as a one-row data frame
tidy.meansd <- function(x, ...) {
  data.frame(
    variable = attr(x, "variable"),
    mean = x["mean"],
    sd = x["sd"],
    sample.size = attr(x, "sampsize")
  )
}

mean_sd(penguins$body_mass_g)
penguins$body_mass_g : 4201.75 (801.95)
tidy(mean_sd(penguins$body_mass_g))
                 variable     mean       sd sample.size
mean penguins$body_mass_g 4201.754 801.9545         344
Apply the mean_sd function to the penguins body mass in grams by species and sex. Organize the results into a table suitable for publication, where it is easy to compare the two sexes.
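One possible layout for this table is sketched below in base R. To keep the block self-contained it uses a small hypothetical helper, fmt_mean_sd, in place of mean_sd; it returns the same "mean (sd)" text that print.meansd displays. Dropping penguins with unknown sex is an assumption, not something the exercise requires.

```r
library(palmerpenguins)

# Hypothetical helper: format one numeric vector as "mean (sd)"
fmt_mean_sd <- function(x, digits = 1) {
  m <- formatC(mean(x, na.rm = TRUE), format = "f", digits = digits)
  s <- formatC(sd(x, na.rm = TRUE), format = "f", digits = digits)
  paste0(m, " (", s, ")")
}

# Assumption: penguins with unknown sex are left out of the table
peng <- subset(penguins, !is.na(sex))

# One cell per species-by-sex combination; the sexes become columns
tab <- tapply(
  peng$body_mass_g,
  list(Species = peng$species, Sex = peng$sex),
  fmt_mean_sd
)
tab
```

With the two sexes in adjacent columns, comparing them within a species only requires reading across one row.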
Load the LPR data example from "https://sachsmc.github.io/r-programming/data/lpr-ex.rds"
library(here)
here() starts at /home/micsac/Teaching/Courses/r-programming
lpr <- readRDS(here("data", "lpr-ex.rds"))
Reshape the data into wide format, with one column holding the primary diagnosis (hdia) for each visit number.
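A minimal sketch of the wide reshape, using base R's reshape(). The lpr data is not reproduced here, so toy is a hypothetical stand-in that assumes one row per visit with the columns pid, visit, and hdia.

```r
# Hypothetical stand-in for lpr: one row per participant visit
toy <- data.frame(
  pid   = c("A", "A", "A", "B", "B"),
  visit = c(1, 2, 3, 1, 2),
  hdia  = c("D150", "I210", "D152", "J189", "D159")
)

# One row per participant, one hdia column per visit number
toy_wide <- reshape(
  toy,
  direction = "wide",
  idvar     = "pid",     # identifies a participant
  timevar   = "visit",   # its values become column suffixes
  v.names   = "hdia",    # the variable spread across the new columns
  sep       = "_visit"
)
toy_wide
```

Participants with fewer visits get NA in the columns for the visits they did not have. On the real data, columns that change from visit to visit would probably have to be dropped first or listed in v.names.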
Reshape the data into longer format, where all of the diagnoses are stored in a single variable, with another variable indicating the primary diagnosis.
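The long reshape could look roughly like the sketch below, which stacks the diagnosis columns with base R. The toy data and the column names diag1 and diag2 are assumptions made for illustration; only hdia is named in the exercise.

```r
# Hypothetical stand-in for lpr: diagnoses spread over several columns
toy <- data.frame(
  pid   = c("A", "A", "B"),
  visit = c(1, 2, 1),
  hdia  = c("D150", "I210", "J189"),
  diag1 = c("E119", NA, "D152"),   # assumed name for a secondary diagnosis
  diag2 = c(NA, NA, "I109")        # assumed name for a secondary diagnosis
)

diag_cols <- c("hdia", "diag1", "diag2")

# Stack the diagnosis columns: one block of rows per source column
toy_long <- do.call(rbind, lapply(diag_cols, function(col) {
  data.frame(
    toy[c("pid", "visit")],
    name     = col,            # which column the diagnosis came from
    diag     = toy[[col]],     # the diagnosis code itself
    maindiag = col == "hdia"   # TRUE for the primary diagnosis
  )
}))

# Drop empty diagnosis slots and sort by participant and visit
toy_long <- toy_long[!is.na(toy_long$diag), ]
toy_long <- toy_long[order(toy_long$pid, toy_long$visit), ]
toy_long
```

Each visit now contributes one row per recorded diagnosis, and maindiag marks the row that came from hdia.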
Create a new variable for each participant which equals TRUE if they had any diagnosis of either D150, D152, or D159 before the date 1 January 2010.
     pid                 age            sex                indat
 Length:8046        Min.   :36.50   Length:8046        Min.   :2005-01-02
 Class :character   1st Qu.:63.50   Class :character   1st Qu.:2007-09-23
 Mode  :character   Median :68.00   Mode  :character   Median :2010-06-01
                    Mean   :68.43                      Mean   :2010-06-15
                    3rd Qu.:77.00                      3rd Qu.:2013-03-01
                    Max.   :90.50                      Max.   :2015-12-31
     visit           diagnum          diag                 id
 Min.   : 1.000   Min.   :0.000   Length:8046        Min.   :   1.0
 1st Qu.: 2.000   1st Qu.:1.000   Class :character   1st Qu.: 495.2
 Median : 3.000   Median :2.000   Mode  :character   Median : 997.0
 Mean   : 3.466   Mean   :1.791                      Mean   : 996.5
 3rd Qu.: 5.000   3rd Qu.:3.000                      3rd Qu.:1494.0
 Max.   :13.000   Max.   :6.000                      Max.   :2008.0
  maindiag       ddiag_pre2010
 Mode :logical   Mode :logical
 FALSE:6038      FALSE:8032
 TRUE :2008      TRUE :14
     pid                 age            sex                indat
 Length:8046        Min.   :36.50   Length:8046        Min.   :2005-01-02
 Class :character   1st Qu.:63.50   Class :character   1st Qu.:2007-09-23
 Mode  :character   Median :68.00   Mode  :character   Median :2010-06-01
                    Mean   :68.43                      Mean   :2010-06-15
                    3rd Qu.:77.00                      3rd Qu.:2013-03-01
                    Max.   :90.50                      Max.   :2015-12-31
     visit            name               diag            maindiag
 Min.   : 1.000   Length:8046        Length:8046        Mode :logical
 1st Qu.: 2.000   Class :character   Class :character   FALSE:6038
 Median : 3.000   Mode  :character   Mode  :character   TRUE :2008
 Mean   : 3.466
 3rd Qu.: 5.000
 Max.   :13.000
 ddiag_pre2010
 Mode :logical
 FALSE:8032
 TRUE :14
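For the last exercise, the per-participant flag can be sketched on toy data as follows. The data frame toy_long is a hypothetical stand-in for the long-format lpr data, and the sketch assumes that "before 1 January 2010" means strictly earlier than that date. The column names pid, indat, diag, and ddiag_pre2010 follow the summaries above.

```r
# Hypothetical stand-in for the long-format data: one row per diagnosis
toy_long <- data.frame(
  pid   = c("A", "A", "A", "B", "B", "C"),
  indat = as.Date(c("2008-03-01", "2008-03-01", "2011-06-15",
                    "2012-01-10", "2012-01-10", "2009-12-31")),
  diag  = c("I210", "D152", "D150", "D159", "J189", "E119")
)

target <- c("D150", "D152", "D159")
cutoff <- as.Date("2010-01-01")

# Row-level indicator: a target diagnosis recorded strictly before the cutoff
hit <- toy_long$diag %in% target & toy_long$indat < cutoff

# ave() repeats the per-participant result on every row of that participant
toy_long$ddiag_pre2010 <- as.logical(ave(hit, toy_long$pid, FUN = any))
toy_long
```

Participant A is flagged because of the D152 diagnosis in 2008. Participant B has a target diagnosis, but only after the cutoff, and participant C has no target diagnosis at all.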