Loops in R

repeat

Many Data Science tasks require you to do the same thing over and over again. This is boring work—but not for your computer!

But how do you ask your computer to repeat a task?

The purrr package provides one way. I recommend that you check out the Iteration primer to learn how to use purrr and its map functions.
This tutorial will show you another way. You will learn how to repeat tasks with R’s loop functions. You now know enough to understand loops, and loops will expand your ability to write useful functions.

Did you know?

Did you know that some languages, like C, C++, and Python, use loops as an all-purpose programming tool? This approach will not work well in R.

R’s loops are the right tool for a very specific job, but they lose out to other methods for other jobs. This is due to R’s user-oriented design: most R functions implement their own pre-optimized loops behind the scenes in a lower level language. These built-in loops are much faster than any loop you could write at the command line.

But don’t let this drive you loopy! You’ll learn when and when not to use loops in R if you read through to the end of When to use loops. But first, we have a more pressing topic:

What is a loop anyways?

loops

R contains three types of loops:

repeat
while
for

The simplest of these are repeat loops.

repeat

repeat repeats an expression over and over again. To use repeat, type repeat without trailing parentheses. Then use braces to enclose one or more lines of code, e.g.

repeat {
  print("Hello")
}

## Hello
## Hello
## Hello
## Hello
## Hello
## ...

repeat will execute all of the code between the braces. Then it will execute it again. And again. And again…

break

When an R loop encounters the break command, it exits the loop. This lets you schedule the end of a repeat loop.

How many times will R repeat the code in the loop below? Make a prediction, then run the code to see if you are right.

n <- 1
repeat {
  print(n)
  if (n == 5) break
  n <- n + 1
}

Here is a good question:

What happens to the value of n when you run the loop?

n is defined as 1 outside of the loop. Is n still 1 when the loop is finished?

n <- 1
repeat {
  print(n)
  if (n == 5) break
  n <- n + 1
}
n

Run the code to find out.
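For reference, here is a sketch of what that experiment shows. The loop reassigns n in the same environment where n was created, so the change is still there after the loop exits. (The print() call is left out here only to keep the output short.)

```r
n <- 1
repeat {
  if (n == 5) break   # stop once n has reached 5
  n <- n + 1          # otherwise overwrite n with the next value
}
n
## [1] 5
```

So n is no longer 1: the loop has altered an object in your environment.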

count

One way to prevent loops from altering objects in your environment is to put the loop into a function. For example, here is a function that counts from one to the number x. Do you see how it works?

count <- function(x) {
  n <- 1
  repeat {
    print(n)
    if (n == x) break
    n <- n + 1
  }
}
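Here is a quick check of that claim, as a sketch. The n inside count() lives in the function's own environment, so an object named n in your workspace should be left alone. The starting value of 100 is arbitrary and chosen only for illustration.

```r
count <- function(x) {
  n <- 1
  repeat {
    print(n)
    if (n == x) break
    n <- n + 1
  }
}

n <- 100   # an unrelated n in the calling environment
count(3)   # prints 1, 2, 3
n          # the function's n never touched this one
## [1] 100
```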

Can you write something similar? Write a function named count_down that counts from x to one. Be sure to arrange for the loop to end!

count_down <- function(x) {

  
  
  
  
  
}

"Begin by setting n = x. In which direction should you increment n to get to one?"

"Increment n by subtracting one from it on each repetition. You will want to do this _after_ you print the value of n for that repetition."

"Arrange to break the loop after you print n when n = 1."

count_down <- function(x) {
  n <- x
  repeat {
    print(n)
    if (n == 1) break
    n <- n - 1
  }
}

return

break is not the only way to end a loop in R.

If your loop is in a function, you can end the loop with a return() statement. R will stop executing the function (and therefore the loop) and return the value supplied by return().

Alter your function below to end the loop with return("Blast off!"). You will no longer need the break command.

count_down <- function(x) {
  n <- x
  repeat {
    print(n)
    if (n == 1) break
    n <- n - 1
  }
}

"Replace `break` with something else."

count_down <- function(x) {
  n <- x
  repeat {
    print(n)
    if (n == 1) return("Blast off!")
    n <- n - 1
  }
}

count_down(10)

## [1] 10
## [1] 9
## [1] 8
## [1] 7
## [1] 6
## [1] 5
## [1] 4
## [1] 3
## [1] 2
## [1] 1

## [1] "Blast off!"

Congratulations!

A loop executes a piece of code multiple times, perhaps allowing changes to accumulate as it goes.

You’ve learned the essence of loops; now it is time to learn the forms. In the next sections, you will learn how to refine repeat loops into two common types of loops: while loops and for loops.

Each of these is a repeat loop adapted to a specific task.

Along the way, you will work through a simple project: you will write and then refine a function that checks whether a number is prime. (This is a common computer science rite of passage.)

Let’s begin the project now with a repeat loop. You’ll need to know some things about prime numbers and %% to begin…

Prime numbers

In mathematics, a prime number is a number greater than one that can be divided evenly only by itself and the number one. In other words, if you try to divide a prime number by any whole number between one and itself, you get a remainder (to keep things simple, let’s not worry about zero or negative numbers).

For example, five is a prime number because you get a remainder when you divide it by two, three, and four:
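The calls that produced the results shown next are missing from this copy of the tutorial. The values are consistent with dividing five by each whole number from five down to one, so the following is a reconstruction under that assumption:

```r
5 / 5
5 / 4
5 / 3
5 / 2
5 / 1
```

Only the divisions by five and by one come out as whole numbers.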

## [1] 1
## [1] 1.25
## [1] 1.666667
## [1] 2.5
## [1] 5

Interestingly, it is hard to prove that a number is prime…

…unless you are a computer. Then you can divide the number by every whole number between one and itself and show that each result has a remainder.

%%

To do this, you’ll need to use the modulo operator, a %% b, which we met in the control flow tutorial. Modulo is an arithmetic operator that returns the remainder of dividing a by b. For example, when you divide five by five, nothing is left over. When you divide five by four, one is left over. When you divide five by three, two is left over. And so on.

5 %% 5
5 %% 4
5 %% 3
5 %% 2
5 %% 1

## [1] 0
## [1] 1
## [1] 2
## [1] 1
## [1] 0

Modulo doesn’t return a decimal; it returns the number that remains once you subtract from a the largest multiple of b that fits into it (i.e., the remainder you would get from long division).
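As a small illustration that goes slightly beyond the exercise, R also has an integer-division operator, %/%, which returns the whole-number quotient. Together the two operators rebuild the original number. The values 17 and 5 are arbitrary.

```r
a <- 17
b <- 5

quotient  <- a %/% b   # 3, because 5 fits into 17 three whole times
remainder <- a %% b    # 2, what is left after subtracting 3 * 5

# The quotient and the remainder rebuild the original value
b * quotient + remainder
## [1] 17
```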

is_prime

The code below uses a repeat loop to provide the beginnings of a function. Complete the loop:

Change n to begin at 2 instead of 1
Add an if statement that ends the loop and returns TRUE if/when n == x.
Add a second if statement that checks whether x %% n == 0 and returns FALSE if so.

is_prime <- function(x) {
  n <- 1
  repeat {
    print(n)
    
    
    n <- n + 1
  }
}

is_prime <- function(x) {
  n <- 2
  repeat {
    print(n)
    if (n == x) return(TRUE)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
}

Use is_prime to check whether or not 89 is a prime number.

is_prime(89)

while

while loops run while a logical condition is true.

And, hey! So does our repeat loop! It runs while n is less than x. How do you know? Because it stops and returns TRUE when n equals x.

is_prime <- function(x) {
  n <- 2
  repeat {
    print(n)
    if (n == x) return(TRUE)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
}

Let’s replace repeat with while.

while takes a logical test, like if, and a chunk of code, like repeat. while will repeat the code until the logical test returns FALSE (unless something in the code ends the loop first).

n <- 1
while (n <= 5) {
  print(n)
  n <- n + 1
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5

You can read while as “while this condition is true, repeat that.”
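The earlier remark that something in the code can end the loop first can be sketched like this. Here a break leaves the loop while the logical test is still TRUE. The cutoff of 3 is arbitrary.

```r
n <- 1
while (n <= 5) {
  if (n == 3) break   # leave early, even though n <= 5 is still TRUE
  print(n)            # prints 1, then 2
  n <- n + 1
}
n
## [1] 3
```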

Can you use while?

Rewrite is_prime to use while instead of repeat. The new loop should run while n is less than x.
Have is_prime return TRUE in its last line, after the loop has run.
- Under which conditions will the function return TRUE?

is_prime <- function(x) {
  n <- 2
  repeat {
    print(n)
    if (n == x) return(TRUE)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
}

"The new loop should run while n is less than x."

is_prime <- function(x) {
  n <- 2
  while (n < x) {
    print(n)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
  TRUE
}

Congratulations! Your while loop works.

is_prime(7)

## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6

## [1] TRUE

In fact, it does the same thing as your repeat loop. So why use while?

while loops can be easier to write than repeat loops; you do not need to insert your own if statement.

while loops are also easier to read: while makes the code author’s intentions obvious.

Now let’s look at for. for loops provide the same advantages for a different situation. A for loop says “for each of these values do this.”

for

A for loop takes a defined set of values and repeats the loop once for each of the values.

And, hey! So does our while loop! It runs once for each value in the set {2, 3, 4, …, x - 1}. How do you know? Because the loop begins with n = 2, increments n by one, and ends when n reaches x.

is_prime <- function(x) {
  n <- 2
  while (n < x) {
    print(n)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
  TRUE
}

Let’s replace while with for.

for syntax

The syntax of for is similar to while, but instead of taking a logical test, for takes a three part statement:

An object that may, or may not, be used in the code chunk that follows for.
in (this never changes)
A vector of values to iterate over. On each run of the loop, for will assign a different value of the vector to the object named in step 1.

for (n in c(1, 2, 3, 4, 5)) {
  print(n)
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5

for automatically ends the loop once it has iterated over each value in the vector.

for example

for loops are very versatile. For example, for makes it easy to iterate over strange sets of values:

for (i in c(1, 10, 200)) {
  print(i)
}

## [1] 1
## [1] 10
## [1] 200

for the win

You can also iterate over “un-ordered” sets. You don’t even need to stick to numbers:

for (person in c("Betty", "Veronica", "Archie")) {
  greeting <- paste("Hello", person)
  print(greeting)
}

## [1] "Hello Betty"
## [1] "Hello Veronica"
## [1] "Hello Archie"

seq

That’s great! But how can you pass for the set {2, 3, 4, …x}? You’ll need to do something like that to retool is_prime.

One possibility is to use seq(2, x), which creates a sequence of integers from 2 to x.

seq(1, 10)

##  [1]  1  2  3  4  5  6  7  8  9 10

seq, seq_len and seq_along form a complete family of helper functions for creating sequences in R.
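Here is a brief sketch of how the three helpers differ. The inputs are arbitrary. The last two calls show one reason to reach for seq_len(): it returns an empty vector when asked for zero values, whereas seq() counts down when its second argument is smaller than its first.

```r
seq(2, 10, by = 2)            # every second integer from 2 to 10
## [1]  2  4  6  8 10

seq_len(5)                    # 1, 2, ..., 5
## [1] 1 2 3 4 5

seq_along(c("a", "b", "c"))   # one index per element
## [1] 1 2 3

seq_len(0)                    # empty: a for loop over this never runs
## integer(0)

seq(1, 0)                     # counts down instead of being empty
## [1] 1 0
```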

Can you use for?

Convert is_prime to use a for loop that iterates over n in seq(2, x - 1).
- Why wouldn’t you want the loop to go all the way to n = x?

is_prime <- function(x) {
  n <- 2
  while (n < x) {
    print(n)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
  TRUE
}

"You no longer need to define n with an initial value, nor do you need to increment n."

is_prime <- function(x) {
  for (n in seq(2, x - 1)) {
    print(n)
    if (x %% n == 0) return(FALSE)
  }
  TRUE
}
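One edge case is worth checking by hand. When x is 2, seq(2, x - 1) becomes seq(2, 1), which counts down (2, 1) instead of producing an empty sequence. The loop above then tests 2 %% 2 and reports that 2 is not prime. A possible guard is sketched below under a hypothetical name (is_prime_safe is not part of the tutorial). It builds the sequence with seq_len() so that the sequence is empty when there is nothing to test, and it leaves out print() to keep the output short.

```r
is_prime_safe <- function(x) {
  if (x < 2) return(FALSE)   # assumption: treat numbers below 2 as not prime
  # seq_len(x - 2) + 1 gives 2, 3, ..., x - 1, and is empty when x is 2
  for (n in seq_len(x - 2) + 1) {
    if (x %% n == 0) return(FALSE)
  }
  TRUE
}

is_prime_safe(2)
## [1] TRUE
is_prime_safe(9)
## [1] FALSE
is_prime_safe(89)
## [1] TRUE
```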

Here’s an odd question: what if you wanted to skip some value in a loop?

For example, what if we wanted to skip n = 5 when we run our loop?

You can do that with the last loop helper provided by R, next.

When R encounters next in a loop, it will move on to the next iteration of the loop without executing the rest of the loop. Here, when n = 5, next causes R to move on to the next iteration of the loop (where n will equal 6).

is_prime <- function(x) {
  for (n in seq(2, x - 1)) {
    if (n == 5) next
    print(n)
    if (x %% n == 0) return(FALSE)
  }
  TRUE
}

If five is the last value in the loop, next will cause R to exit the loop.
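To see next on its own, here is a minimal sketch outside of is_prime. The vector named visited is a hypothetical helper that records which values made it past the next statement.

```r
visited <- c()
for (n in 1:6) {
  if (n == 5) next           # skip the rest of this iteration when n is 5
  visited <- c(visited, n)   # only reached when n is not 5
}
visited
## [1] 1 2 3 4 6
```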

Quiz

n <- 1
while (n < 2) {
  print(n)
  next
  n <- n + 1
}

Congratulations! You’ve learned how to use all of R’s loop functions! To be a loop master, read on to learn when to use loops in R and when not to.

When to use loops

I mentioned earlier that loops should not appear as frequently in your R code as they would in your C, C++, or Python code. Why is this? Because R is an extremely vectorized language.

Vectorization

By “vectorized” I mean that most R functions are designed to take vectors as input and to return vectors as output. So if, for example, you’d like to take the square root of every value in a vector, you do not need to loop over the values in the vector…

x <- c(1, 2, 3, 4, 5)
y <- double(length = 5)
for (i in seq_along(x)) {
  y[i] <- sqrt(x[i])
}
y

## [1] 1.000000 1.414214 1.732051 2.000000 2.236068

…you can just pass the vector to the square root function.

sqrt(x)

## [1] 1.000000 1.414214 1.732051 2.000000 2.236068

And the best part about R’s vectorization is that it is very fast!
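If you would like to see the difference for yourself, here is a rough sketch using system.time(). The vector size of one million is arbitrary, and the timings will vary from machine to machine, so no expected numbers are shown.

```r
x <- runif(1e6)   # one million random values between 0 and 1

# Explicit loop: fill a pre-allocated result vector one element at a time
loop_time <- system.time({
  y_loop <- double(length(x))
  for (i in seq_along(x)) {
    y_loop[i] <- sqrt(x[i])
  }
})

# Vectorized call: one function call handles the whole vector
vec_time <- system.time(y_vec <- sqrt(x))

all.equal(y_loop, y_vec)   # both approaches compute the same values
loop_time["elapsed"]       # seconds taken by the loop
vec_time["elapsed"]        # seconds taken by the vectorized call
```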

R’s arithmetic operators are also vectorized. If you add two vectors, R will add each pair of elements in the vectors and return the results as a new vector.

c(1, 2, 3, 4, 5) + c(1, 2, 3, 4, 5)

## [1]  2  4  6  8 10

And if you use R functions to build your functions, then your functions will inherit R’s vectorization:

round_square <- function(vec) {
  sqs <- sqrt(vec)
  round(sqs, digits = 2)
}

round_square(x)

## [1] 1.00 1.41 1.73 2.00 2.24

Vectorization reduces the need to use loops. You can think of each vectorized function as implementing a loop for you.

recursion

Some problems cannot be solved with vectorized functions, but even for these you do not necessarily need to use a loop.

For example, we could solve the prime problem with recursion. This is the strategy of having a function call itself until one of the recursive calls returns an answer, which is then passed back up the call stack.

is_prime <- function(x, n = 2) {
  if (n == x) return(TRUE)
  else if (x %% n == 0) return(FALSE)
  else is_prime(x, n = n + 1)
}

is_prime(89)

## [1] TRUE

is_prime(88)

## [1] FALSE

map

Or you could use a map function from the purrr package.
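Here is a minimal sketch of the idea. To keep it free of extra packages, it uses base R's vapply() as a stand-in for purrr's map functions; with purrr loaded, map_lgl(candidates, is_prime) would play a similar role. The vector named candidates is a hypothetical set of inputs.

```r
# The recursive is_prime from the previous section
is_prime <- function(x, n = 2) {
  if (n == x) return(TRUE)
  else if (x %% n == 0) return(FALSE)
  else is_prime(x, n = n + 1)
}

candidates <- c(7, 8, 89)   # arbitrary example inputs

# vapply() calls is_prime() once per element and checks that each
# result is a single logical value
vapply(candidates, is_prime, logical(1))
## [1]  TRUE FALSE  TRUE
```

The loop over candidates still happens. It is simply handled by vapply() instead of being written out by hand.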

It is all up to you. The moral is that loops are not an all purpose tool in R. They are best reserved for problems that are not easy to solve with a vectorized function.

To learn more about when you definitely should use a loop, check out Hadley Wickham’s three suggestions in Advanced R (be aware, it is advanced).