repeat
Many Data Science tasks require you to do the same thing over and over again. This is boring work—but not for your computer!
But how do you ask your computer to repeat a task?
The purrr package provides one way. I recommend that you check out the Iteration primer to learn how to use purrr and its map functions.
This tutorial will show you another way. You will learn how to repeat tasks with R’s loop functions. You now know enough to understand loops, and loops will expand your ability to write useful functions.
Did you know?
Did you know that some languages, like C, C++, and Python, use loops as an all-purpose programming tool? This approach will not work well in R.
R’s loops are the right tool for a very specific job, but they lose out to other methods for other jobs. This is due to R’s user-oriented design: most R functions implement their own pre-optimized loops behind the scenes in a lower level language. These built-in loops are much faster than any loop you could write at the command line.
But don’t let this drive you loopy! You’ll learn when and when not to use loops in R if you read through to the end of When to use loops. But first, we have a more pressing topic:
What is a loop anyways?
loops
R contains three types of loops:
- `repeat`
- `while`
- `for`
The simplest of these are `repeat` loops.
repeat
`repeat` repeats an expression over and over again. To use `repeat`, type `repeat` without trailing parentheses. Then use braces to enclose one or more lines of code, e.g.
repeat {
print("Hello")
}
## [1] "Hello"
## [1] "Hello"
## [1] "Hello"
## [1] "Hello"
## [1] "Hello"
## ...
`repeat` will execute all of the code between the braces. Then it will execute it again. And again. And again…
break
When an R loop encounters the `break` command, it exits the loop. This lets you schedule the end of a `repeat` loop.
- How many times will R repeat the code in the loop below? Make a prediction, then click Submit Answer to see if you are right.
n <- 1
repeat {
print(n)
if (n == 5) break
n <- n + 1
}
Here is a good question:
What happens to the value of `n` when you run the loop?
`n` is defined as `1` outside of the loop. Is `n` still `1` when the loop is finished?
n <- 1
repeat {
  print(n)
  if (n == 5) break
  n <- n + 1
}
n
- Click Submit Answer to find out.
n
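The same check can be written out away from the exercise widget. This is a minimal sketch that drops `print()` to keep the output short. Because the loop runs at the top level rather than inside a function, each `n <- n + 1` overwrites the `n` that was defined before the loop.

```r
n <- 1
repeat {
  if (n == 5) break
  n <- n + 1   # reassigns the same n that was created outside the loop
}
n
## [1] 5
```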
count
One way to prevent loops from altering objects in your environment is to put the loop into a function. For example, here is a function that counts from one to the number `x`. Do you see how it works?
count <- function(x) {
  n <- 1
  repeat {
    print(n)
    if (n == x) break
    n <- n + 1
  }
}
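To see that protection at work, here is a small sketch. It assumes a top-level object named `n` already exists (the value 100 is arbitrary and only for illustration), then runs `count()` and checks that the top-level `n` is unchanged.

```r
n <- 100   # a top-level n, chosen arbitrarily for illustration

count <- function(x) {
  n <- 1   # a separate n that exists only inside the function
  repeat {
    print(n)
    if (n == x) break
    n <- n + 1
  }
}

count(3)
## [1] 1
## [1] 2
## [1] 3
n          # the top-level n was not touched by the loop
## [1] 100
```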
- Can you write something similar? Write a function named `count_down` that counts from `x` to one. Be sure to arrange for the loop to end! Then click Submit Answer.
count_down <- function(x) {
}
"Begin by setting n = x. In which direction should you increment n to get to one?"
"Increment n by subtracting one from it on each repetition. You will want to do this _after_ you print the value of n for that repetition."
"Arrange to break the loop after you print n when n = 1."
count_down <- function(x) {
n <- x
repeat {
print(n)
if (n == 1) break
n <- n - 1
}
}
return
`break` is not the only way to end a loop in R.
If your loop is in a function, you can end the loop with a `return()` statement. R will stop executing the function (and therefore the loop) and return the value supplied by `return()`.
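As an illustration, here is a sketch with a hypothetical function, `first_power_of_two_above()`, that is not part of the tutorial. It doubles a value inside a `repeat` loop and uses `return()` both to stop the loop and to hand back the result.

```r
# Hypothetical helper: smallest power of two that is greater than limit
first_power_of_two_above <- function(limit) {
  value <- 1
  repeat {
    if (value > limit) return(value)   # ends the loop and the function at once
    value <- value * 2
  }
}

first_power_of_two_above(100)
## [1] 128
```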
- Alter your function below to end the loop with `return("Blast off!")`. You will no longer need the `break` command. Then click Submit Answer.
count_down <- function(x) {
n <- x
repeat {
print(n)
if (n == 1) break
n <- n - 1
}
}
"Replace `break` with something else."
count_down <- function(x) {
n <- x
repeat {
print(n)
if (n == 1) return("Blast off!")
n <- n - 1
}
}
count_down(10)
## [1] 10
## [1] 9
## [1] 8
## [1] 7
## [1] 6
## [1] 5
## [1] 4
## [1] 3
## [1] 2
## [1] 1
## [1] "Blast off!"
Congratulations!
A loop executes a piece of code multiple times, perhaps allowing changes to accumulate as it goes.
You’ve learned the essence of loops; now it is time to learn the forms. In the next sections, you will learn how to refine `repeat` loops into two common types of loops:
1. `while` loops, and
2. `for` loops
Each of these is a `repeat` loop adapted to a specific task.
Along the way, you will work through a simple project: you will write and then refine a function that checks whether a number is prime. (This is a common computer science rite of passage.)
Let’s begin the project now with a `repeat` loop. You’ll need to know some things about prime numbers and `%%` to begin…
Prime numbers
In mathematics, a prime number is a number that can be divided evenly only by itself and the number one. In other words, if you try to divide a prime number by any whole number greater than one and less than itself, you get a remainder (to keep things simple, let’s not worry about zero or negative numbers).
For example, five is a prime number because you get a remainder when you divide it by two, three, and four:
5 / 5
5 / 4
5 / 3
5 / 2
5 / 1
## [1] 1
## [1] 1.25
## [1] 1.666667
## [1] 2.5
## [1] 5
Interestingly, it is hard to prove that a number is prime…
…unless you are a computer. Then you can divide the number by every number less than itself and show that the result has a remainder.
%%
To do this, you’ll need to use the modulo operator, `a %% b`, which we met in the control flow tutorial. Modulo is an arithmetic operator that returns the remainder of dividing `a` by `b`. For example, when you divide five by five, nothing is left over. When you divide five by four, one is left over. When you divide five by three, two is left over. And so on.
5 %% 5
5 %% 4
5 %% 3
5 %% 2
5 %% 1
## [1] 0
## [1] 1
## [1] 2
## [1] 1
## [1] 0
Modulo doesn’t return a decimal; it returns the number that remains once you subtract the largest multiple of `b` from `a` (i.e., the number that you would get for the remainder if you did long division).
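The long-division picture can be checked directly. The sketch below uses R's integer division operator, `%/%`, which this tutorial has not introduced, alongside `%%`. The values 17 and 5 are arbitrary.

```r
a <- 17
b <- 5

a %/% b                  # integer division: how many whole times b fits into a
## [1] 3
a %% b                   # modulo: what is left over
## [1] 2
b * (a %/% b) + a %% b   # the two pieces rebuild a
## [1] 17
```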
is_prime
The code below uses a `repeat` loop to provide the beginnings of a function. Complete the loop:
- Change `n` to begin at `2` instead of `1`.
- Add an `if` statement that ends the loop and returns `TRUE` if/when `n == x`.
- Add a second `if` statement that checks whether `x %% n == 0` and returns `FALSE` if so.
- Then click Submit Answer.
is_prime <- function(x) {
n <- 1
repeat {
print(n)
n <- n + 1
}
}
is_prime <- function(x) {
n <- 2
repeat {
print(n)
if (n == x) return(TRUE)
if (x %% n == 0) return(FALSE)
n <- n + 1
}
}
- Use `is_prime` to check whether or not 89 is a prime number. Click Submit Answer to run your code.
is_prime(89)
while
`while` loops run while a logical condition is true.
And, hey! So does our `repeat` loop! It runs while n is less than x. How do you know? Because it stops and returns `TRUE` when n equals x.
is_prime <- function(x) {
  n <- 2
  repeat {
    print(n)
    if (n == x) return(TRUE)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
}
Let’s replace `repeat` with `while`.
`while` takes a logical test, like `if`, and a chunk of code, like `repeat`. `while` will repeat the code until the logical test returns `FALSE` (unless something in the code ends the loop first).
n <- 1
while (n <= 5) {
  print(n)
  n <- n + 1
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
You can read `while` as “while this condition is true, repeat that.”
Can you use while?
- Rewrite `is_prime` to use `while` instead of `repeat`. The new loop should run while n is less than x.
- Have `is_prime` return `TRUE` in its last line, after the loop is run.
  - Under which conditions will the function return `TRUE`?
- Then click Submit Answer.
is_prime <- function(x) {
n <- 2
repeat {
print(n)
if (n == x) return(TRUE)
if (x %% n == 0) return(FALSE)
n <- n + 1
}
}
"The new loop should run while n is less than x."
is_prime <- function(x) {
n <- 2
while (n < x) {
print(n)
if (x %% n == 0) return(FALSE)
n <- n + 1
}
TRUE
}
Congratulations! Your `while` loop works.
is_prime(7)
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] TRUE
In fact, it does the same thing as your `repeat` loop. So why use `while`?
`while` loops can be easier to write than `repeat` loops; you do not need to insert your own `if` statement.
`while` loops are also easier to read: `while` makes the code author’s intentions obvious.
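A side-by-side sketch may make the difference concrete. Both functions below are hypothetical examples, not part of the tutorial project: each counts how many times a number must be halved before it drops below one. The `repeat` version needs its own `if` and `break`, while the `while` version states the stopping rule on its first line.

```r
# Hypothetical example: repeat needs an explicit exit test
halvings_repeat <- function(x) {
  steps <- 0
  repeat {
    if (x < 1) break
    x <- x / 2
    steps <- steps + 1
  }
  steps
}

# The same logic with while: the condition is visible up front
halvings_while <- function(x) {
  steps <- 0
  while (x >= 1) {
    x <- x / 2
    steps <- steps + 1
  }
  steps
}

halvings_repeat(10)
## [1] 4
halvings_while(10)
## [1] 4
```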
Now let’s look at `for`. `for` loops provide the same advantages for a different situation. A `for` loop says “for each of these values do this.”
for
A for loop takes a defined set of values and repeats the loop once for each of the values.
And, hey! So does our `while` loop! It runs once for each value in the set {2, 3, 4, …, x}. How do you know? Because the loop begins with n = 2, increments n by one, and ends when n = x.
is_prime <- function(x) {
  n <- 2
  while (n < x) {
    print(n)
    if (x %% n == 0) return(FALSE)
    n <- n + 1
  }
  TRUE
}
Let’s replace `while` with `for`.
for syntax
The syntax of `for` is similar to `while`, but instead of taking a logical test, `for` takes a three-part statement:
1. An object that may, or may not, be used in the code chunk that follows `for`.
2. `in` (this never changes).
3. A vector of values to iterate over. On each run of the loop, `for` will assign a different value of the vector to the object named in step 1.
for (n in c(1, 2, 3, 4, 5)) {
print(n)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
`for` automatically ends the loop once it has iterated over each value in the vector.
for example
`for` loops are very versatile. For example, `for` makes it easy to increment over strange sets of values:
for (i in c(1, 10, 200)) {
print(i)
}
## [1] 1
## [1] 10
## [1] 200
for the win
You can also increment over “un-ordered” sets. You don’t even need to stick to numbers:
for (person in c("Betty", "Veronica", "Archie")) {
  greeting <- paste("Hello", person)
  print(greeting)
}
## [1] "Hello Betty"
## [1] "Hello Veronica"
## [1] "Hello Archie"
seq
That’s great! But how can you pass `for` the set {2, 3, 4, …, x}? You’ll need to do something like that to retool `is_prime`.
One possibility is to use `seq(2, x)`, which creates a sequence of integers from 2 to x.
seq(1, 10)
## [1] 1 2 3 4 5 6 7 8 9 10
`seq`, `seq_len`, and `seq_along` form a complete family of helper functions for creating sequences in R.
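A brief sketch of the three helpers, using arbitrary inputs. The last two calls show a detail worth knowing before you put these in a loop: `seq()` counts down when its first argument is larger than its second, whereas `seq_len(0)` returns an empty sequence.

```r
seq(2, 6)                     # from 2 up to 6
## [1] 2 3 4 5 6
seq_len(4)                    # 1 up to the given length
## [1] 1 2 3 4
seq_along(c("a", "b", "c"))   # one index per element of the vector
## [1] 1 2 3
seq(2, 1)                     # counts down when the start exceeds the end
## [1] 2 1
seq_len(0)                    # empty, so a for loop over it would not run
## integer(0)
```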
Can you use for?
- Convert `is_prime` to use a `for` loop that iterates over `n in seq(2, x - 1)`.
  - Why wouldn’t you want the loop to go all the way to n = x?
- Then click Submit Answer.
is_prime <- function(x) {
n <- 2
while (n < x) {
print(n)
if (x %% n == 0) return(FALSE)
n <- n + 1
}
TRUE
}
"You no longer need to define n with an initial value, nor do you need to increment n."
is_prime <- function(x) {
for (n in seq(2, x - 1)) {
print(n)
if (x %% n == 0) return(FALSE)
}
TRUE
}
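One edge case is worth a look. When `x` is 2, `seq(2, x - 1)` becomes `seq(2, 1)`, which counts down through 2 and 1, so the loop tests `2 %% 2` and reports that 2 is not prime. The sketch below reproduces this with a copy of the function above (renamed `is_prime_for` and without `print()`), then shows one possible guard. `is_prime_safe` is a hypothetical name, and treating numbers below 2 as not prime is an assumption made here, not something the tutorial specifies.

```r
# Copy of the for-loop version above, without print()
is_prime_for <- function(x) {
  for (n in seq(2, x - 1)) {
    if (x %% n == 0) return(FALSE)
  }
  TRUE
}
is_prime_for(2)   # seq(2, 1) is c(2, 1), so the loop tests 2 %% 2
## [1] FALSE

# Hypothetical guarded variant
is_prime_safe <- function(x) {
  if (x < 2) return(FALSE)          # assumption: 1 and below are not prime
  for (n in seq_len(x - 2) + 1) {   # 2, 3, ..., x - 1; empty when x is 2
    if (x %% n == 0) return(FALSE)
  }
  TRUE
}
is_prime_safe(2)
## [1] TRUE
is_prime_safe(89)
## [1] TRUE
is_prime_safe(88)
## [1] FALSE
```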
Here’s an odd question: what if you wanted to skip some value in a loop?
For example, what if we wanted to skip n = 5 when we run our loop?
You can do that with the last loop helper provided by R, `next`.
next
When R encounters `next` in a loop, it will move on to the next iteration of the loop without executing the rest of the loop. Here, when n = 5, `next` causes R to move on to the next iteration of the loop (where n will equal 6).
is_prime <- function(x) {
  for (n in seq(2, x - 1)) {
    if (n == 5) next
    print(n)
    if (x %% n == 0) return(FALSE)
  }
  TRUE
}
If five is the last value in the loop, `next` will cause R to exit the loop.
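A smaller sketch of the same idea, separate from the prime project: the loop below uses `next` to skip every even number and collects the rest in a vector named `odds` (a name chosen here for illustration).

```r
odds <- c()
for (n in seq(1, 10)) {
  if (n %% 2 == 0) next   # skip the rest of the body when n is even
  odds <- c(odds, n)      # reached only when n is odd
}
odds
## [1] 1 3 5 7 9
```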
Quiz
n <- 1
while (n < 2) {
  print(n)
  next
  n <- n + 1
}
Congratulations! You’ve learned how to use all of R’s loop functions! To be a loop master, click on the button to learn when to use loops in R and when not to.
When to use loops
I mentioned earlier that loops should not appear as frequently in your R code as they would in your C, C++, or Python code. Why is this? Because R is an extremely vectorized language.
Vectorization
By “vectorized” I mean that most R functions are designed to take vectors as input and to return vectors as output. So if, for example, you’d like to take the square root of every value in a vector, you do not need to loop over the values in the vector…
x <- c(1, 2, 3, 4, 5)
y <- double(length = 5)
for (i in seq_along(x)) {
  y[i] <- sqrt(x[i])
}
y
## [1] 1.000000 1.414214 1.732051 2.000000 2.236068
…you can just pass the vector to the square root function.
sqrt(x)
## [1] 1.000000 1.414214 1.732051 2.000000 2.236068
And the best part about R’s vectorization is that it is very fast!
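One rough way to see the speed difference for yourself is `system.time()`. The sketch below is not a rigorous benchmark: the vector length of one million is arbitrary, the names `values`, `y_loop`, and `y_vec` are made up for this example, and the timings depend on your machine, so no expected timings are shown.

```r
values <- runif(1e6)   # one million random numbers, an arbitrary size

# Time an explicit loop over every element
loop_time <- system.time({
  y_loop <- double(length(values))
  for (i in seq_along(values)) {
    y_loop[i] <- sqrt(values[i])
  }
})

# Time the vectorized call
vector_time <- system.time(y_vec <- sqrt(values))

all.equal(y_loop, y_vec)   # both approaches produce the same values
## [1] TRUE
loop_time["elapsed"]       # seconds taken by the loop
vector_time["elapsed"]     # seconds taken by the vectorized call
```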
R’s arithmetic operators are also vectorized. If you add two vectors, R will add each pair of elements in the vectors and return the results as a new vector.
c(1, 2, 3, 4, 5) + c(1, 2, 3, 4, 5)
## [1] 2 4 6 8 10
And if you use R functions to build your functions, then your functions will inherit R’s vectorization:
round_square <- function(vec) {
  sqs <- sqrt(vec)
  round(sqs, digits = 2)
}
round_square(x)
## [1] 1.00 1.41 1.73 2.00 2.24
Vectorization reduces the need to use loops. You can think of each vectorized function as implementing a loop for you.
recursion
Some problems cannot be solved with vectorized functions, but even for these you do not necessarily need to use a loop.
For example, we could solve the prime problem with recursion. This is the strategy of having a function call itself until one of the recursive calls returns an answer, which is then passed back up the call stack.
is_prime <- function(x, n = 2) {
  if (n == x) return(TRUE)
  else if (x %% n == 0) return(FALSE)
  else is_prime(x, n = n + 1)
}
is_prime(89)
## [1] TRUE
is_prime(88)
## [1] FALSE
map
Or you could use a map function from the purrr package.
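Here is one way that might look, as a sketch only. `is_prime_map` is a hypothetical name, and the handling of numbers below 3 is an assumption added here. It relies on `map_lgl()` from purrr, which applies a function to each element and returns a logical vector. Unlike the loop versions, it tests every candidate divisor rather than stopping at the first one that divides evenly.

```r
# Hypothetical purrr-based variant; assumes x is a whole number
is_prime_map <- function(x) {
  if (x < 3) return(x == 2)   # assumption: 2 is prime, anything smaller is not
  divisors <- seq(2, x - 1)
  # map_lgl() calls the function once per divisor and returns TRUE/FALSE values
  divides_evenly <- purrr::map_lgl(divisors, function(n) x %% n == 0)
  !any(divides_evenly)
}

# purrr is an add-on package, so only run the calls if it is installed
if (requireNamespace("purrr", quietly = TRUE)) {
  print(is_prime_map(89))   # TRUE
  print(is_prime_map(88))   # FALSE
}
```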
It is all up to you. The moral is that loops are not an all purpose tool in R. They are best reserved for problems that are not easy to solve with a vectorized function.
To learn more about when you definitely should use a loop, check out Hadley Wickham’s three suggestions in Advanced R (be aware, it is advanced).