Programming

if()

if() evaluates a logical expression and, if the result is TRUE, executes the given commands.

The expression to evaluate is inside ( ), and the commands to execute are inside { }.

For example:

a = 2
b = 0

if (a > 1) {
    b = 7
}

b
## [1] 7

Further, you can include multiple commands to execute.

if (a < 10) {
    b = 24
    c = 5
}

b
## [1] 24
c
## [1] 5

You can also combine several conditions in a single test.

if (a > 0 & a < 10) {
    print("a is between 0 and 10!")
}
## [1] "a is between 0 and 10!"
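The condition above joins two comparisons with &, which compares vectors element by element. Inside if(), where a single TRUE or FALSE is needed, && is often preferred: it expects single values and skips the right-hand side when the left-hand side is already FALSE. A minimal sketch of the difference (the vector x is invented for illustration):

```r
a <- 2

# && works on single values and short-circuits: the right-hand side
# is only evaluated when the left-hand side is TRUE
in_range <- a > 0 && a < 10

# & works element by element and returns one result per element
x <- c(-5, 2, 50)            # made-up values for illustration
elementwise <- x > 0 & x < 10

print(in_range)
print(elementwise)
```

Here in_range is a single TRUE, while elementwise holds one logical value for each element of x.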

if ... else

You can combine if() with else to execute one set of commands if the condition is true and another if it is false. (This is different from ifelse(), a separate function that works element by element on vectors.)

a <- 20

if (a > 0 & a < 10) {
    print("a is between 0 and 10!")
} else {
    print("a is NOT between 0 and 10!")
}
## [1] "a is NOT between 0 and 10!"

Note: when code is run at the top level (for example, typed at the console), else must come on the same line as the closing } of the if() block. If not, it will fail:

if(a > 0 & a < 10) {
  print("a is between 0 and 10!")
}  
else {
  print("a is NOT between 0 and 10!")
}
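One way to see this failure without stopping a script is to pass the code to parse() as text. The sketch below does that, and also shows that the same layout is accepted when the whole statement sits inside { }, as it would in a function body. The names bad_code, top_level and wrapped are made up for this illustration.

```r
# Code with 'else' on its own line, stored as text so that we can
# hand it to the parser without stopping the script
bad_code <- "if (TRUE) {\n  1\n}\nelse {\n  2\n}"

# At top level the parser treats the if block as finished at the
# newline, so the 'else' that follows is a syntax error
top_level <- tryCatch({
    parse(text = bad_code)
    "parsed"
}, error = function(e) "syntax error")

# Wrapped in { }, the parser keeps reading and the same code is accepted
wrapped <- tryCatch({
    parse(text = paste0("{\n", bad_code, "\n}"))
    "parsed"
}, error = function(e) "syntax error")

print(top_level)
print(wrapped)
```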

Loops and repeats

A for() loop assigns each value of a sequence in turn to an index, i, and runs the enclosed code once for each value.

For example, to print the values of i from 1 to 5, we run

for (i in 1:5) print(i)
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5

We can modify the output as it goes:

for (i in 1:5) print(i^2)
## [1] 1
## [1] 4
## [1] 9
## [1] 16
## [1] 25

As with functions, when the loop contains multiple lines of code, we need to enclose them in curly brackets { }. Note also that the index variable, i, does not have to be used inside the { }.

j = k = 0

for (i in 1:5) {
    j = j + 1
    k = k + j
    print(j + k)
}
## [1] 2
## [1] 5
## [1] 9
## [1] 14
## [1] 20

Example loop

Here we use a loop to calculate the number of individuals of each species in the indices dataset.

indices <- read.table("../data/indices.txt", header = TRUE)

# create empty vector for the results
total.sp <- vector()

# create a vector of species names
spp <- unique(indices$species)

# run a loop through every species name, summing the number of individuals
# of that species
for (i in seq_along(spp)) {
    total.sp[i] <- sum(indices$individuals[indices$species == spp[i]])
}

total.sp
## [1]  9 11 13 13

However, note that R has many built-in functions that are more efficient than writing loops.

For example, we could calculate the number of individuals per species using tapply() instead:

tapply(indices$individuals, indices$species, sum)
##  a  b  c  d 
##  9 11 13 13
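Since the data file is not included here, the comparison can be tried on a small stand-in. The sketch below uses a toy data frame whose values are invented (they are not the contents of indices.txt) and checks that the loop and tapply() give the same totals. It also preallocates the result vector and uses seq_along(), two common habits when a loop is needed.

```r
# Toy data, invented for illustration (not the contents of indices.txt)
toy <- data.frame(
    species     = c("a", "b", "a", "c", "b", "c"),
    individuals = c(2, 5, 7, 1, 6, 3)
)

spp <- unique(toy$species)

# Preallocate the result vector instead of growing it inside the loop
total.sp <- numeric(length(spp))

# seq_along(spp) is safe even when spp is empty, unlike 1:length(spp),
# which would then count 1, 0
for (i in seq_along(spp)) {
    total.sp[i] <- sum(toy$individuals[toy$species == spp[i]])
}

# The same totals without a loop
totals <- tapply(toy$individuals, toy$species, sum)

print(total.sp)
print(totals)
```

Both approaches give 9, 11 and 4 for species a, b and c in this toy dataset.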

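while()

A while() loop repeats its commands for as long as its condition remains true. The function below keeps subtracting 2 from its argument and printing the result, stopping once the value is no longer greater than 3.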

test.while <- function(x) {
    t = x
    while (t > 3) {
        t = t - 2
        print(t)
    }
}

test.while(14)
## [1] 12
## [1] 10
## [1] 8
## [1] 6
## [1] 4
## [1] 2

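repeat

A repeat loop has no condition of its own. It runs until a break command is reached.

test.repeat <- function(x) {
    t <- x
    repeat {
        # escape clause: leave the loop once t drops below 2
        if (t < 2) 
            break
        t <- t - 3
        print(t)
    }
}

test.repeat(10)
## [1] 7
## [1] 4
## [1] 1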

Infinite loops

Because repeat contains no explicit limit, you need to be careful not to program an infinite loop. You must have a logical escape clause that leads to a break command.
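One defensive pattern is to add a second escape clause that stops the loop after a fixed number of iterations. The sketch below is only an illustration of that idea: the function halve.until.small() and its max.iter argument are hypothetical names, not part of R.

```r
# Hypothetical helper: halve x until it drops below 1, but never
# run more than max.iter times
halve.until.small <- function(x, max.iter = 100) {
    n.iter <- 0
    repeat {
        # escape clause 1: the job is done
        if (x < 1) 
            break
        # escape clause 2: safety limit in case the condition is never met
        if (n.iter >= max.iter) {
            warning("stopped after max.iter iterations")
            break
        }
        x <- x / 2
        n.iter <- n.iter + 1
    }
    n.iter
}

halve.until.small(10)
```

The call returns 4, the number of halvings needed to bring 10 below 1. With an input such as Inf, which never drops below 1, the safety limit ends the loop with a warning instead of running forever.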

The next function uses while() to generate the Fibonacci series 1, 1, 2, 3, 5, 8, ..., in which each term is the sum of its two predecessors. The key point about while() loops is that the variable controlling their operation is altered inside the loop. In this example, we alter n, the position in the series of the Fibonacci number we want: we start at n, reduce its value by 1 each time around the loop, and stop when n gets down to 0.

fibonacci <- function(n) {
    a = 1
    b = 0
    while (n > 0) {
        swap = a
        a = a + b
        b = swap
        n = n - 1
    }
    b
}

fibonacci(10)
## [1] 55

An important general point about computing involves the use of the swap variable above. When we replace a by a + b on line 6 of the function, we lose the original value of a. If we had not stored this value in swap, we could not set the new value of b to the old value of a on line 7. We can test the function by generating the first 10 Fibonacci numbers using sapply().

sapply(1:10, fibonacci)
##  [1]  1  1  2  3  5  8 13 21 34 55
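To see why the swap variable matters, here is a deliberately broken variant that leaves it out. The name fibonacci.noswap is made up for this illustration. Because a is overwritten before b is updated, b simply copies the new value of a and the result doubles on each pass instead of following the Fibonacci series.

```r
# Deliberately broken variant, for illustration only: no swap variable
fibonacci.noswap <- function(n) {
    a <- 1
    b <- 0
    while (n > 0) {
        a <- a + b
        b <- a      # too late: a has already been overwritten
        n <- n - 1
    }
    b
}

wrong <- sapply(1:6, fibonacci.noswap)
print(wrong)
```

This prints 1 2 4 8 16 32 rather than 1 1 2 3 5 8.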

Avoiding loops

It is good (R) programming practice to avoid using loops wherever possible. Vectorized functions make this particularly straightforward in many cases, and apply(), tapply(), and sapply() are useful for avoiding loops in more complicated cases.

Here is a simple case. Suppose that you wanted to replace all of the negative values in a vector by zeros. In the old days, you might have written something like this:

# original vector 'y'
y <- c(-2, -3, 1, 4, 5, -7, 23, 13, -9)
y
## [1] -2 -3  1  4  5 -7 23 13 -9

# loop through each value and change the negative values to 0:
for (i in 1:length(y)) {
    if (y[i] < 0) 
        y[i] <- 0
}

y
## [1]  0  0  1  4  5  0 23 13  0

Now, however, you would use logical subscripts like this:

y <- c(-2, -3, 1, 4, 5, -7, 23, 13, -9)
y
## [1] -2 -3  1  4  5 -7 23 13 -9

# change negative values to 0 in one step
y[y < 0] <- 0
y
## [1]  0  0  1  4  5  0 23 13  0
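For this particular task, base R also provides pmax(), which returns the larger of two values element by element. A short sketch of that alternative, which stores the result under a new name (y.clipped, chosen here for illustration) and leaves y unchanged:

```r
y <- c(-2, -3, 1, 4, 5, -7, 23, 13, -9)

# pmax() compares each element of y with 0 and keeps the larger value
y.clipped <- pmax(y, 0)

print(y.clipped)
```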

Remember that it is not possible to undo commands in R!

So, when replacing values in a dataset, it is best to first create a new dataset, or save the original under a different name, just to be safe.

# original data
y <- c(-2, -3, 1, 4, 5, -7, 23, 13, -9)

# save original data with a different name in case you need to revert to it
y.orig <- y

# perform action
y[y < 0] <- 0
y
## [1]  0  0  1  4  5  0 23 13  0

Alternatively, simply read the data into R again and rerun your analyses.


Exercises