Introduction and Overview

Day 1

Michael C Sachs

Introduction

Learning objectives

  • Understand and apply programming principles to handle repetitive tasks
  • Understand how to use and make your own functions in R
  • Use loops in R
  • Apply these principles to perform efficient data manipulation, analysis, and reporting

Why?

If your research involves any data, it likely also involves:

  • Data management and manipulation
  • Analysis
  • Reporting of results

This means lots of code.

We can borrow some ideas from software development to

  1. Make our lives easier – repetitive tasks, updates with new data
  2. Make science better – reproducible, understandable, reusable

Goals

At a bare minimum, we want our code to be:

Organized

  • Coherent structure of
    • folders, files, and scripts

Transportable

  • Works on
    • my future computer
    • other people’s computers

Understandable

  • Clearly specified dependencies
  • Readable code, sensible naming of things
  • Written documentation explaining what and how

Extras

Nice to have, but not absolutely necessary

Provably correct

Through automated testing

  • Pieces of code correctly do what they should
  • Code works together correctly, not sensitive to minor changes in the data
  • Valid statistical properties

Version controlled

  • History of development is tracked and documented

Overview of this course

Lectures

Lessons

Work through a problem together, learn tools and strategies

Sticky note system

red note = “I need help!”

green note = “Good to go!”

No sticky = still working

Share solutions/questions/responses on padlet:

https://padlet.com/sachsmc/rprogs24/

Course website:

https://sachsmc.github.io/r-programming/

Schedule of topics

Today

  1. Reporting, dynamic documents
  2. Data structures and indexing

Thursday

  1. Flow control and loops
  2. Creating and using functions

Next Monday

  1. Functions continued
  2. Working with data, merging and reshaping

Next Thursday

  1. Data visualization
  2. Working with data, dates and characters

About me

What I do

  • Develop and evaluate statistical methods
    • Write R packages
    • Write R code to test the methods
  • Collaborate on research projects involving data analysis
    • Use register and other data from stats Denmark
    • Data analysis, visualization and reporting

My R philosophy

  • I started learning R in 2005 (version 2.2)
  • There is almost no problem that can’t be solved with R
    • Flexible and dynamic
    • Plays nicely with almost any other programming language

Course philosophy

  • There are no stupid questions
  • There are many ways to solve a problem, there is no single “right way”
  • I will not force you to learn any particular way, e.g., tidyverse vs data.table, we will focus on learning the general principles.
    • Try them out, then choose one and get good at it
    • Same with project organization, choose a system and stick with it
  • Show up to class and participate, you will pass and hopefully learn something useful

Take notes and practice

To get credit for the course, you must attend at least 80% of the course.

I encourage you to work on the exercises and take notes as we proceed.

  1. Use quarto to blend code, output, and prose.
  2. Demonstrate mastery of at least 3 out of the 4 learning objectives.
  3. Review the previous day’s material and build on it in the next day.

Office hours and independent study

  • This is the condensed version of the course. Lectures and lessons from 8 - 12 each day. Sometimes it will seem fast.
  • You will need to spend some time on your own reviewing the material and practicing

Every day after class (from 12 - 14), I will hold office hours where you can come ask questions and get help on any R topic

My office is CSS 10.2.25

Interactive slides

Slides will occasionally allow you to interact with the code. Follow along with lectures and use it for short quizzes. Try it out:

Questions?

Link home