Background

Types of functions

Almost everything in R is a function. There are several types of functions; the most familiar type is called using the prefix form:

> help("function")

Infix operators are also functions. You can call them in prefix form by enclosing the operator in backticks:

> 1 + 2
[1] 3
> `+`(1, 2)
[1] 3
> #help("+")
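
Because operators are ordinary functions, the backtick form also lets you pass them to other functions. The following is a small sketch of that idea; the variable names are chosen only for illustration.

```r
# Backticks turn any infix operator into an ordinary prefix call.
prod_prefix <- `*`(3, 4)         # same as 3 * 4
is_member   <- `%in%`(2, 1:3)    # same as 2 %in% 1:3

# Because operators are functions, they can be handed to other functions.
running_total <- Reduce(`+`, 1:4)        # ((1 + 2) + 3) + 4
shifted       <- sapply(1:3, `+`, 10)    # adds 10 to each element

print(prod_prefix)     # 12
print(is_member)       # TRUE
print(running_total)   # 10
print(shifted)         # 11 12 13
```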

Replacement functions assign values. The assignment operator <- is itself a function and can be called in prefix form:

> x <- 1
> x
[1] 1
> 
> `<-`(x, 2)
> x
[1] 2
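
Functions whose names end in <-, such as names<-, follow the same pattern. A minimal sketch, with v and w as illustrative variable names, comparing the usual replacement syntax with the equivalent prefix call:

```r
v <- c(1, 2, 3)
names(v) <- c("a", "b", "c")            # usual replacement syntax

w <- c(1, 2, 3)
w <- `names<-`(w, c("a", "b", "c"))     # prefix form; the result must be assigned back

print(identical(v, w))   # TRUE
```

In prefix form the modified object is returned rather than changed in place, which is why the result is assigned back to w.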

Special functions include the indexing operators [ and [[ and the control-flow constructs if and for.
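
These can also be written in prefix form with backticks. This is rarely useful in practice, but the sketch below shows that they really are function calls; the variable names are illustrative only.

```r
answer <- `if`(3 > 2, "yes", "no")          # same as: if (3 > 2) "yes" else "no"
second <- `[[`(list(a = 1, b = 2), "b")     # same as: list(a = 1, b = 2)[["b"]]

squares <- numeric(3)
`for`(i, 1:3, squares[i] <- i^2)            # same as: for (i in 1:3) squares[i] <- i^2

print(answer)    # "yes"
print(second)    # 2
print(squares)   # 1 4 9
```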

An anonymous function is a function that is defined inline, usually passed to another function, without assigning it a name. You may have used them in lapply and variants:

> x <- list(1:3, 4:5)
> lapply(x, function(z) z[2])
[[1]]
[1] 2

[[2]]
[1] 5

Now that you know that indexing is itself a function, can you perform the same operation without using an anonymous function?

> lapply(x, `[`, 2)
[[1]]
[1] 2

[[2]]
[1] 5

Finding and inspecting functions

The help file for a function can be found by calling help or by using the ? operator:

> help("sample")
> ?sample

The body of a function can be printed to the console by typing the name of the function without parentheses:

> sample
function (x, size, replace = FALSE, prob = NULL) 
{
    if (length(x) == 1L && is.numeric(x) && is.finite(x) && x >= 
        1) {
        if (missing(size)) 
            size <- x
        sample.int(x, size, replace, prob)
    }
    else {
        if (missing(size)) 
            size <- length(x)
        x[sample.int(length(x), size, replace, prob)]
    }
}
<bytecode: 0x559a627b2ed8>
<environment: namespace:base>

The environment line at the end of the printed output tells you where the function lives, i.e., in which package namespace. You can refer to a function in a specific package by typing the package name, then ::, then the function name:

> base::sample
function (x, size, replace = FALSE, prob = NULL) 
{
    if (length(x) == 1L && is.numeric(x) && is.finite(x) && x >= 
        1) {
        if (missing(size)) 
            size <- x
        sample.int(x, size, replace, prob)
    }
    else {
        if (missing(size)) 
            size <- length(x)
        x[sample.int(length(x), size, replace, prob)]
    }
}
<bytecode: 0x559a627b2ed8>
<environment: namespace:base>
> base::sample(1:10, 1)
[1] 9
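
The same information can be retrieved programmatically. The sketch below assumes a standard R installation, where the tools package is installed but not attached by default; it uses environment and environmentName to find where sample lives and :: to reach a function without calling library first.

```r
env <- environment(sample)
print(environmentName(env))               # "base"
print(identical(base::sample, sample))    # TRUE unless another sample masks it

# :: also reaches functions in packages that are installed but not attached.
ext <- tools::file_ext("report.csv")
print(ext)                                # "csv"
```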

Function arguments

What are the arguments of the sample function? How can we find out?

> #help(sample)
> formals(sample)
$x


$size


$replace
[1] FALSE

$prob
NULL
> args(sample)
function (x, size, replace = FALSE, prob = NULL) 
NULL
> str(sample)
function (x, size, replace = FALSE, prob = NULL)  
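
Since formals returns a named list, individual arguments and their defaults can be picked out by name. A short sketch; scale_by is a hypothetical function defined here only for illustration.

```r
arg_names <- names(formals(sample))
print(arg_names)                        # "x" "size" "replace" "prob"

# Individual defaults can be read by name.
print(formals(sample)$replace)          # FALSE
print(is.null(formals(sample)$prob))    # TRUE

# formals() works the same way on functions you write yourself.
scale_by <- function(x, factor = 2) x * factor
print(formals(scale_by)$factor)         # 2
```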

Which arguments have default values? What are they?

When you call a function, arguments are matched by:

  1. Exact name
  2. Partial name
  3. Position

in that order. The following calls are equivalent. Which one is easiest to understand?

> x <- 1:4
> 
> set.seed(123)
> sample(x = x, size = 2, replace = FALSE)
[1] 3 4
> 
> set.seed(123)
> sample(x, si = 2, rep = FALSE)
[1] 3 4
> 
> set.seed(123)
> sample(replace = FALSE, size = 2, x = x)
[1] 3 4
> 
> set.seed(123)
> sample(x, 2)
[1] 3 4
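
The matching rules can also be tried on a small function without random output. In the sketch below, raise is a hypothetical function written for illustration; the last call shows that a partial name matching more than one argument is an error, which is one reason to prefer exact names.

```r
# Hypothetical function used only to illustrate argument matching.
raise <- function(base, exponent = 2, extra = 0) {
  base^exponent + extra
}

by_exact    <- raise(base = 3, exponent = 2)   # exact names
by_partial  <- raise(3, expo = 2)              # "expo" uniquely matches "exponent"
by_position <- raise(3, 2)                     # position only
mixed       <- raise(exponent = 2, 3)          # named first, the rest by position

# "ex" matches both "exponent" and "extra", so R refuses to guess.
ambiguous <- tryCatch(
  raise(3, ex = 2),
  error = function(e) "error: ambiguous partial match"
)

print(c(by_exact, by_partial, by_position, mixed))   # 9 9 9 9
print(ambiguous)                                     # "error: ambiguous partial match"
```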

Ellipsis as a function argument

... (the ellipsis) is a special function argument that stands for an arbitrary number of additional arguments. Here are some possible uses:

  1. Passing arguments to other functions. If your function calls another function that may take arguments, they can be passed directly using ... instead of listing the names of the arguments in your function. An example is lapply. Why is the ... necessary in this case?
> formals(lapply)
$X


$FUN


$...

Other examples are c, rbind, and cbind. Why do these functions use ...?
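
As a rough sketch of both uses, the hypothetical functions below (mean_and_n and count_args are names invented for this example) forward ... to another function and collect ... into a list.

```r
# Forwarding: extra arguments are passed on to mean() through ...
mean_and_n <- function(x, ...) {
  c(mean = mean(x, ...), n = length(x))
}
res <- mean_and_n(c(1, 2, NA), na.rm = TRUE)   # na.rm is handled by mean()
print(res)                                     # mean 1.5, n 3

# Collecting: list(...) captures however many arguments were supplied.
count_args <- function(...) {
  length(list(...))
}
print(count_args(1, "a", TRUE))                # 3
```

Neither function needs to know in advance which arguments mean accepts or how many values the caller will supply.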

Lazy evaluation

R does not evaluate function arguments until they are used in the function body. This is called lazy evaluation. It is also why R does not check for missing arguments up front: an error won't occur until the argument is actually used in the function.

> h01 <- function(x) {
+     
+     "Hello world!"
+     
+ }
> 
> h01()
[1] "Hello world!"
> h01(stop("Error"))
[1] "Hello world!"
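
For contrast, the sketch below uses two hypothetical functions (uses_x and late_default, named only for this example). The first touches its argument, so the error is raised; the second shows that a default value is also evaluated lazily, at the point where it is first used.

```r
# Unlike h01, this function uses x, so the argument is evaluated.
uses_x <- function(x) {
  x
  "Hello world!"
}
result <- tryCatch(uses_x(stop("Error")), error = function(e) conditionMessage(e))
print(result)             # "Error"

# Defaults are lazy too: b is computed when first used, after a has changed.
late_default <- function(a, b = a * 2) {
  a <- 10
  b
}
print(late_default(1))    # 20
```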

One way to manually check for arguments is with missing:

> h02 <- function(x) {
+     
+     if(missing(x)) {
+         return("Missing x!")
+     }
+     "Hello world!"
+     
+ }
> 
> h02()
[1] "Missing x!"
> h02(1)
[1] "Hello world!"
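
If a missing argument should be treated as a mistake rather than handled quietly, missing can be combined with stop. This is one possible pattern, not the only one; h_strict is a hypothetical name.

```r
h_strict <- function(x) {
  if (missing(x)) {
    # Fail early with a clear message instead of returning a string.
    stop("argument 'x' is required", call. = FALSE)
  }
  "Hello world!"
}

msg <- tryCatch(h_strict(), error = function(e) conditionMessage(e))
print(msg)            # "argument 'x' is required"
print(h_strict(1))    # "Hello world!"
```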

Intro to lexical scoping

An environment can be thought of as a collection of objects. When you are working at the console, you are working in the global environment. You can view the names of objects in an environment with ls, which lists the objects in the current environment by default:

> ls()
[1] "h01" "h02" "x"  
> 
> hello <- "Hello world"
> 
> ls()
[1] "h01"   "h02"   "hello" "x"    

When a function is invoked, its body runs inside a new environment that contains the arguments and anything defined in the function body:

> f01 <- function(y) {
+     
+     x <- 1
+     ls()
+     
+ }
> 
> f01("a")
[1] "x" "y"
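
A new environment is created for every call. The sketch below makes this visible by returning the environment itself with environment(); make_env is a hypothetical helper written for this example.

```r
make_env <- function(y) {
  tmp <- 1
  environment()   # return the environment created for this particular call
}

e1 <- make_env("a")
e2 <- make_env("b")

print(ls(e1))                    # "tmp" "y"
print(identical(e1, e2))         # FALSE: each call gets its own environment
print(get("y", envir = e2))      # "b"
```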

The code in the function body works with objects, so how does R find them, and in which environments does it look? This is the concept of lexical scoping. When a function is called, R follows simple rules to find things:

  1. Look in the current function environment first, that is, the environment created for this call.
  2. If not found, look in the parent environment, that is, the environment where the function was defined.
  3. Repeat 2 until there are no more parents.
  4. If not found, throw an error.
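
A consequence of these rules is that names are resolved relative to where a function was defined, not where it is called from. The following sketch uses hypothetical names (greeting, show_greeting, caller, lookup_fails) to illustrate this, along with the error raised when a name is not found anywhere.

```r
greeting <- "defined at top level"

show_greeting <- function() {
  greeting            # not found locally, so R looks where show_greeting was defined
}

caller <- function() {
  greeting <- "defined in caller"   # local to caller, invisible to show_greeting
  show_greeting()
}

print(caller())       # "defined at top level"

# A name that exists in no enclosing environment produces an error.
lookup_fails <- function() name_that_does_not_exist_anywhere
err <- tryCatch(lookup_fails(), error = function(e) "object not found")
print(err)            # "object not found"
```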

What do the following functions return?

> x <- 1
> f02 <- function(y = 2) {
+     
+     y + x
+     
+ }
> 
> f02()
[1] 3
> 
> y <- 1
> f03 <- function() {
+     
+     y <- 2
+     i <- function() {
+         
+         z <- 3
+         c(x, y, z)
+     }
+     
+     i()
+ }
> 
> 
> f03()
[1] 1 2 3

Exercises

  1. Define your own infix function that concatenates two strings. Can this operator be used to concatenate more than 2 strings? How does that work? Write the function call in prefix form to show how.
> `% %` <- function(a, b) {
+     
+     paste(a, b)
+     
+ }
> 
> "my" % % "name"
[1] "my name"
> "my" % % "name" % % "is" % % "Mike"
[1] "my name is Mike"
> 
> `% %`(`% %`(`% %`("my", "name"), "is"), "Mike")
[1] "my name is Mike"
  2. Define your own replacement function that replaces the second element of a vector.
> `second<-` <- function(x, value){
+     
+     x[2] <- value
+     x
+     
+ }
> 
> x <- 1:10
> second(x) <- 11
> x
 [1]  1 11  3  4  5  6  7  8  9 10

2b. Make a more general function that replaces the ith element of a vector. Illustrate how the replacement function works when composed.

> `modify<-` <- function(x, i, value) {
+     
+     x[i] <- value
+     x
+     
+ }
> 
> x <- 1:10
> modify(x, 2) <- modify(x, 1) <- 11
> x
 [1] 11 11  3  4  5  6  7  8  9 10
> 
> x <- 1:10
> `modify<-`(`modify<-`(x, 2, 11), 1, 11)
 [1] 11 11  3  4  5  6  7  8  9 10
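
To see what a single replacement call does behind the scenes, it helps to write out the equivalent assignment. The sketch below redefines modify<- so that it runs on its own and uses x1 and x2 as illustrative names. R passes the right-hand side as the argument named value, which is why the last argument of a replacement function must have that name.

```r
`modify<-` <- function(x, i, value) {
  x[i] <- value
  x
}

x1 <- 1:10
modify(x1, 2) <- 11L                       # replacement syntax

x2 <- 1:10
x2 <- `modify<-`(x2, 2, value = 11L)       # what R evaluates, written out

print(identical(x1, x2))                   # TRUE
```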
  3. Look at the function sample. What does sample do when size is not supplied? How would you rewrite this function to make it clearer to users that size is not necessary?