Bugs!

  1. Crash: function call exits with error

  2. Incorrect behavior

  3. Unexpected behavior

library(MASS)
data(Pima.te)

## Crash
subset( head(Pima.te), npreg > a )
Error in eval(e, x, parent.frame()): object 'a' not found

## Incorrect behavior
subset( head(Pima.te), type = "yes" & npreg > 2 )
  npreg glu bp skin  bmi   ped age type
1     6 148 72   35 33.6 0.627  50  Yes
2     1  85 66   29 26.6 0.351  31   No
3     1  89 66   23 28.1 0.167  21   No
4     3  78 50   32 31.0 0.248  26  Yes
5     2 197 70   45 30.5 0.158  53  Yes
6     5 166 72   19 25.8 0.587  51  Yes

## Unexpected behavior
subset( head(Pima.te), type == "yes" & npreg > 2 )
[1] npreg glu   bp    skin  bmi   ped   age   type 
<0 rows> (or 0-length row.names)
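
A short sketch of why the last two calls misbehave, and what the intended call probably looks like. The name pima6 is introduced here for illustration only; the rest uses the same data as above.

```r
library(MASS)
data(Pima.te)
pima6 = head(Pima.te)   # illustrative name, not part of the original example

# `type = ...` is a named argument, not a comparison: it matches no formal
# argument of subset() and ends up in `...`, so no row filter is applied
nrow(subset(pima6, type = "yes" & npreg > 2))

# Comparisons with factor levels are case-sensitive; check the actual labels
levels(pima6$type)

# Presumably intended call: `==` and the correctly capitalised level
ok = subset(pima6, type == "Yes" & npreg > 2)
ok
```

With the six rows shown above, the last call should keep only rows 1, 4 and 6.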

Tools

traceback to find where error occurred:

library(MASS)
data(Pima.te)
subset(Pima.te, npreg > a)
traceback()

debug to execute code line by line:
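
A minimal sketch of the debug workflow, using a hypothetical toy function center() that is not part of the original material. The actual call is commented out because stepping only makes sense in an interactive session.

```r
# Hypothetical toy function, used only to illustrate the workflow
center = function(x) {
  m = mean(x)
  x - m
}

debug(center)        # flag the function: the next call opens the browser
isdebugged(center)   # check that the flag is set

## center(1:3)       # interactive only: n = next line, c = continue, Q = quit

undebug(center)      # remove the flag again
isdebugged(center)   # the flag is gone
```

debugonce(center) is an alternative that removes the flag automatically after a single call.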

browser to set stop points when running code:
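
A minimal sketch of a stop point, again with a hypothetical toy function (slow_sum() is made up for this illustration). The interactive() guard is an added assumption so that the script also runs in batch mode without pausing.

```r
# Hypothetical toy function; the loop stands in for code you want to skip over
slow_sum = function(n) {
  acc = 0
  for (i in seq_len(n)) acc = acc + i
  # Execution pauses here in an interactive session, after the loop has run;
  # local variables such as acc can then be inspected at the prompt
  if (interactive()) browser()
  acc
}

slow_sum(10)
```

Placing browser() after the loop avoids stepping through every iteration, which is the main advantage over plain debug for functions like this.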

RStudio Debug menu: similar functionality, but can only be started for R scripts

Example

f1 = function(x, y, flag = TRUE, n = 1000) 
{
  if ( missing(y) ) 
  {
    if (flag | (x > 0)) y = abs(x)
  }
  g = function(z) z + x
  acc = 0
  ## What happened to ?cumsum 
  for (i in 1:1000) acc = acc + runif(1) 
  g(y)
}

Run tests:

f1(1)
f1(-1)
f1(1, flag = FALSE)
f1(-1, flag = FALSE)

  1. Run traceback (not very useful)

  2. Use debug to run the critical case line-by-line as far as practical

  3. Note how the focus jumps between editor and console; the editor has some decent Debug menu items, but local variables are inspected in the console (e.g. with ls())

  4. Add a browser() statement that allows you to pass over the loop (or use the menu item)

  5. Fix the function, remove the browser if necessary & undebug
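
Before stepping through the code, it can help to pin down which of the four test calls is the critical case. The sketch below runs them in one go and captures any error message instead of stopping; f1 is copied unchanged from the example, while cases and run_case are names made up for this illustration.

```r
# f1 copied unchanged from the example above (bug included)
f1 = function(x, y, flag = TRUE, n = 1000) 
{
  if ( missing(y) ) 
  {
    if (flag | (x > 0)) y = abs(x)
  }
  g = function(z) z + x
  acc = 0
  for (i in 1:1000) acc = acc + runif(1) 
  g(y)
}

# The four test calls, written as argument lists
cases = list(
  list(x = 1),
  list(x = -1),
  list(x = 1, flag = FALSE),
  list(x = -1, flag = FALSE)
)

# Run one case; return the result, or the error message if the call fails
run_case = function(args) {
  tryCatch(do.call(f1, args), error = function(e) conditionMessage(e))
}

res = lapply(cases, run_case)
str(res)
```

Cases that return a number ran through; a case that returns a character string failed, and is the one to pass to debug.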

Minimal replicable example

Simplify any example that generates an error

Smallest self-contained set of data & code that generates the error reliably
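
As a sketch, the crash from the first section can be reduced to a version that needs no package and only one column. The data frame d is a made-up stand-in for Pima.te, and the example assumes a fresh session in which no object called a exists.

```r
# Tiny hypothetical data frame standing in for Pima.te (no package needed)
d = data.frame(npreg = c(6, 1, 3))

# Same failing call as in the crash example, reduced to one column;
# tryCatch() only captures the message so that the script keeps running
msg = tryCatch(subset(d, npreg > a), error = function(e) conditionMessage(e))
msg
```

When sharing such an example, adding the output of sessionInfo() tells the reader which R version and packages were in use.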

Exercise: nestedCC

Small package nestedCC on github.com/alexploner/nestedCC

  1. Reminder: nested case-control studies?
  2. Use menu New project to clone the repository locally
  3. Load the package, and run example(nestedCC) to see how the function works
  4. Look at the small example data set cohort_test in the package
  5. Run the command
nestedCC(cohort_test, event = "event", exit = "time", match = "sex", seed=41)

Look closely at the results, and try to find the error. As the code is not totally obvious, step slowly through it using debug, to understand what is happening.

Solution: branch fix_sample in the repository has a fixed version of the function.