In R the use of functions allows the user to easily extend, modify and manipulate objects and analyses. A function is part of a computer programme that performs a specific action, but is not itself a programme. R packages generally contain functions that carry out specific actions for a given type of analysis.
We have already used a number of functions, e.g. mean(), sd(), plot(), etc.
Why might you want to write your own function?
Generally, if you find yourself writing the same code a few times, it will be worthwhile to try and create a function.
Functions in R take zero or more inputs (arguments), perform some actions on these inputs, and return an output.
The basic template for an R function is
function_name <- function(function_argument1, function_argument2){
function_body
function_return_value
}
For example
Test.function1 <- function(a, b) {
# function body
ab.prod <- a * b
ab.sum <- a + b
prod.sum = ab.prod - ab.sum
# return statement, tells the function what value you want to output
return(prod.sum)
}
Let's go through each part of a function.
The function name can be anything you like, but it is sensible to choose a name that describes what the function does. As with object names, it is possible to overwrite existing objects and functions, so take care.
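As a minimal sketch of why care is needed, the example below deliberately reuses the name of the base function sum(). The replacement function is purely illustrative; the point is that a definition in your workspace masks the original until it is removed.

```r
# Defining our own "sum" in the workspace masks base::sum()
sum <- function(...) "not the real sum"
masked.result <- sum(1, 2)          # our version is found first

# The original is still reachable through its namespace
original.result <- base::sum(1, 2)

# Removing our version restores the normal behaviour
rm(sum)
restored.result <- sum(1, 2)

masked.result
original.result
restored.result
```

Choosing a name that is not already in use avoids this problem altogether.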
Once defined, you call the function as normal, using function_name():
Test.function1(a = 2, b = 3)
## [1] 1
Test.function1(a = 7, b = 2)
## [1] 5
As normal, you can save the output of the function as an object:
a4.b5 <- Test.function1(a = 4, b = 5)
a4.b5
## [1] 11
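The value handed back by a function is whatever is passed to return() or, if there is no explicit return(), the value of the last expression evaluated in the body. The sketch below shows both styles; the two function names are hypothetical and simply mirror the calculation in Test.function1().

```r
# Explicit return(): the value is named in the return statement
prod.minus.sum.explicit <- function(a, b) {
  return(a * b - (a + b))
}

# Implicit return: the last evaluated expression is the output
prod.minus.sum.implicit <- function(a, b) {
  a * b - (a + b)
}

explicit.result <- prod.minus.sum.explicit(4, 5)
implicit.result <- prod.minus.sum.implicit(4, 5)
identical(explicit.result, implicit.result)  # TRUE
```

An explicit return() is often easier to read in longer functions, while short one-line functions commonly rely on the implicit form.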
If we type the function name without parentheses, R prints the function definition itself:
Test.function1
## function(a, b) {
##
## # function body
## ab.prod <- a * b
## ab.sum <- a + b
## prod.sum = ab.prod - ab.sum
##
## # return statement, tells the function what value you want to output
## return(prod.sum)
## }
To tell R that you are writing a function, you create the object with the function keyword, which gives the new object the class "function".
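You can check this yourself with class() and is.function(). The short sketch below uses a hypothetical helper, square.it(), purely for illustration.

```r
# A small illustrative function (square.it is a made-up name)
square.it <- function(x) x^2

fun.class <- class(square.it)      # "function"
fun.check <- is.function(square.it) # TRUE

# Functions that come with R are objects of the same class
builtin.class <- class(mean)       # "function"

fun.class
fun.check
builtin.class
```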
The arguments to a function tell R what to run the function on, what kind of actions to perform, or anything else.
In Test.function1(), the function arguments provide the data.
Other options include:
Sometimes a function is used as a convenience and always does the same thing, so no input is needed. An example might be the ubiquitous "hello world" example from just about any computer science book.
hello.world <- function() {
print("hello world")
}
hello.world()
## [1] "hello world"
For functions with only one line, you can leave out the curly braces { }.
hello.world <- function() print("hello world")
hello.world()
## [1] "hello world"
We could personalize this function, using an argument for the name. Here we call another function within our function: paste().
hello.someone <- function(name) {
# paste() already separates its arguments with a single space
print(paste("hello", name))
}
hello.someone("fred")
## [1] "hello fred"
What happens if you try hello.someone() without an argument?
hello.someone()
## Error: argument "name" is missing, with no default
R returns an error, so we should provide a sensible default. We can define defaults within the parentheses when we define the function, using argument = default.
hello.someone <- function(name = "world") {
# paste() already separates its arguments with a single space
print(paste("hello", name))
}
hello.someone()
## [1] "hello world"
Next, we define a function that simulates n random numbers from a normal distribution (by default with a mean of 10 and a standard deviation of 5) and then calculates the sum of those numbers.
sim.t <- function(n, mu = 10, sigma = 5) {
X <- rnorm(n, mu, sigma)
return(sum(X))
}
There are numerous ways that we can call this function:
sim.t(4) # using defaults
sim.t(4, 3, 10) # n = 4, mu = 3, sigma = 10
sim.t(4, 5) # n = 4, mu = 5, sigma the default 5
sim.t(4, sigma = 100) # n = 4, mu the default 10, sigma = 100
sim.t(4, sigma = 100, mu = 1) # named arguments don't need to be in order
Using named arguments, such as sim.t(4, sigma = 100, mu = 1), allows you to switch the order and avoid specifying all the values. For functions with lots of arguments this is very convenient.
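One way to convince yourself that the order of named arguments does not matter is to fix the random seed with set.seed() before each call and compare the results. This is only a sketch: the seed value 42 and the object names by.position and by.name are arbitrary choices, and sim.t() is repeated so the block runs on its own.

```r
# sim.t() as defined above, repeated so this block is self-contained
sim.t <- function(n, mu = 10, sigma = 5) {
  X <- rnorm(n, mu, sigma)
  return(sum(X))
}

# Same seed, arguments matched by position
set.seed(42)
by.position <- sim.t(4, 3, 10)

# Same seed, arguments matched by name in a different order
set.seed(42)
by.name <- sim.t(sigma = 10, n = 4, mu = 3)

identical(by.position, by.name)  # TRUE
```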
Within a function definition, the special argument ... collects any additional arguments supplied in the call and passes them on to an internal function. This is especially useful with graphics.
plot.f <- function(f, a, b, ...) {
xvals <- seq(a, b, length.out = 100)
plot(xvals, f(xvals), type = "l", ...)
}
This code will plot the sine curve from 0 to 2*pi:
plot.f(f = sin, a = 0, b = 2 * pi)
Fig. Plot of sine curve.
Because we included ... as an argument, we can easily modify the plot without changing the function.
plot.f(f = sin, a = 0, b = 2 * pi, lty = 4)
Fig. Plot of sine curve with different line type.
We could not do this if ... was not an argument in the function.
plot.f <- function(f, a, b) {
# no ... argument, so extra arguments cannot be passed on to plot()
xvals <- seq(a, b, length.out = 100)
plot(xvals, f(xvals), type = "l")
}
plot.f(f = sin, a = 0, b = 2 * pi, lty = 4)
## Error: unused argument (lty = 4)
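The same mechanism is useful outside graphics. As a minimal sketch, the hypothetical function summarise.vec() below forwards anything supplied through ... to both mean() and sd(), so the caller can set na.rm without the function naming that argument itself.

```r
# summarise.vec is a made-up name used only for this illustration
summarise.vec <- function(x, ...) {
  # everything in ... is handed on unchanged to mean() and sd()
  c(mean = mean(x, ...), sd = sd(x, ...))
}

x <- c(1, 2, NA, 4)

with.na <- summarise.vec(x)                  # NA propagates by default
without.na <- summarise.vec(x, na.rm = TRUE) # na.rm is passed through ...

with.na
without.na
```

Note that every argument passed through ... reaches both inner functions, so this pattern only works for arguments that all of them accept.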
default arguments and lazy evaluation in R