About the workshop

Organizers and instructors: Michael C Sachs, Alexander Ploner

Summary and Learning Objectives

R is a free, open-source statistical computing environment that is the language of choice for correct and reproducible analysis. R packages are the fundamental unit of reproducible R code for analysis and reporting. A good R package can have a big impact in scientific research, whether it is an implementation of a novel statistical method or an interface to existing analytic approaches. Researchers at the Karolinska Institute in the fields of molecular, genetic, and clinical epidemiology would benefit from more and better implementations of statistical and epidemiological methods in R. Students, post-docs, and other researchers who are interested in methods have the opportunity to meet this need and potentially make a dramatic impact in their field.

Aside from the mechanics of packaging R code and basic principles of software development, this 3-day course will focus on development of high quality packages and maximizing their impact. Through a series of examples from existing R packages, participants will learn about different strategies for designing and implementing interfaces to statistical and epidemiological methods. Then, we will summarize the steps one can take to maximize the impact of the R package and to obtain academic credit for one's efforts.

Schedule

  • Day 1: Basics, covering the mechanics of packaging R code, modularity and the DRY principle, testing, documentation, and version control.
  • Day 2: Interfaces, covering functions, operators, S3 classes and overloading operators, the pipe operator, and other types of classes (S4, RC).
  • Day 3: Interfaces continued, including Shiny graphical interfaces. Releasing R packages, covering GitHub, web pages, CRAN, Bioconductor, and publishing clinical or software papers describing the package.

Intended Audience

Researchers in biostatistics or epidemiology who have some experience using R and who are interested in methods development. Participants should have at least basic knowledge of R, including how to write functions. It is beneficial, but not necessary, if participants have at least a general idea for an R package that they wish to create or a method that they think should be implemented or improved.