Welcome
Welcome to R
R is easiest to use when you know how the R language works. This tutorial will teach you the implicit background knowledge that informs every piece of R code. You’ll learn about:
- functions and their arguments
- objects
- R’s basic data types
- R’s basic data structures including vectors and lists
- R’s package system
Functions
Run a function
Can you use the sqrt() function in the chunk below to compute the square root of 961?
sqrt(961)
Code
Use the code chunk below to examine the code that sqrt() runs.
sqrt
lm
Compare the code in sqrt() to the code in another R function, lm(). Examine lm()’s code body in the chunk below.
lm
Help pages
Wow! lm() runs a lot of code. What does it do? Open the help page for lm() in the chunk below and find out.
?lm
Code comments
What do you think the chunk below will return? Run it and see. The result should be nothing. R will not run anything on a line after a # symbol. This is useful because it lets you write human-readable comments in your code: just place the comments after a #. Now delete the # and re-run the chunk. You should see a result.
# sqrt(961)
sqrt(961)
Arguments
args()
rnorm() is a function that generates random values from a normal distribution. Find the arguments of rnorm().
args(rnorm)
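As a small sketch of what args() reports, base R's formals() exposes the same information as a named list. Arguments that carry a default value (here mean and sd) are optional, while n has no default and must be supplied.

```r
# formals() returns a function's arguments as a named list.
names(formals(rnorm))   # "n" "mean" "sd"

# Default values are stored alongside the argument names.
formals(rnorm)$mean     # 0
formals(rnorm)$sd       # 1

# Because mean and sd have defaults, only n is required.
length(rnorm(5))        # 5
```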
Optional arguments
rnorm() 1
Use rnorm() to generate 100 random normal values with a mean of 100 and a standard deviation of 50.
rnorm(100, mean = 100, sd = 50)
rnorm() 2
Can you spot the error in the code below? Fix the code and then re-run it.
rnorm(100, mu = 100, sd = 50)
rnorm(100, mean = 100, sd = 50)
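The sketch below shows one way to see why the first call fails: rnorm() has no argument named mu, so R stops with an error. It also shows that naming arguments and passing them by position give the same result when the order follows args(rnorm). The seed value is an arbitrary choice made only so that the two calls can be compared.

```r
# mu is not an argument of rnorm(), so R signals an error.
# tryCatch() captures the error message instead of stopping.
msg <- tryCatch(
  rnorm(100, mu = 100, sd = 50),
  error = function(e) conditionMessage(e)
)
msg

# Arguments can be matched by name or by position.
set.seed(42)  # arbitrary seed, assumed here for reproducibility
by_name <- rnorm(100, mean = 100, sd = 50)
set.seed(42)
by_position <- rnorm(100, 100, 50)
identical(by_name, by_position)  # TRUE
```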
Objects
Object names
You can choose almost any name you like for an object, as long as the name does not begin with a number or a special character like +, -, *, /, ^, !, @, or &.
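One way to check a candidate name, sketched here with a hypothetical helper called is_valid_name(), is to compare it with the output of base R's make.names(), which rewrites any name that R would not accept as is.

```r
# is_valid_name() is a hypothetical helper, not part of base R.
# make.names() leaves syntactically valid names untouched,
# so a name is valid when it comes back unchanged.
is_valid_name <- function(x) make.names(x) == x

is_valid_name(c("my_data", "data2", "2data", "+data"))
# TRUE TRUE FALSE FALSE
```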
Using objects
In the code chunk below, save the results of rnorm(100, mean = 100, sd = 15) to an object named data. Then, on a new line, call the hist() function on data to plot a histogram of the random values.
data <- rnorm(100, mean = 100, sd = 15)
hist(data)
What if?
What do you think would happen if you assigned data to a new object named copy, like this? Run the code and then inspect both data and copy.
data <- rnorm(100, mean = 100, sd = 15)
copy <- data
data
copy
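A short sketch of what to look for when you inspect the two objects: copy holds the same values as data at the moment of assignment, and later changes to data do not touch copy. The seed and the smaller sample size are assumptions made only to keep the output short and reproducible.

```r
set.seed(1)  # arbitrary seed, assumed here for reproducibility
data <- rnorm(5, mean = 100, sd = 15)
copy <- data

# Right after the assignment the two objects hold the same values.
identical(copy, data)   # TRUE

# Changing data afterwards leaves copy as it was.
data <- data + 1
identical(copy, data)   # FALSE
```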
Data sets
Objects provide an easy way to store data sets in R. In fact, R comes with many toy data sets pre-loaded. Examine the contents of iris to see a classic toy data set. Hint: how could you learn more about the iris object?
iris
rm()
What if you accidentally overwrite an object? If that object came with R or one of its packages, you can restore the original version of the object by removing your version with rm(). Run rm() on iris below to restore the iris data set.
iris <- 1
iris
rm(iris)
iris
Vectors
Create a vector
In the chunk below, create a vector that contains the integers from one to ten.
c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
:
If your vector contains a sequence of contiguous integers, you can create it with the : shortcut. Run 1:10 in the chunk below. What do you get? What do you suppose 1:20 would return?
1:10
[]
You can extract any element of a vector by placing a pair of brackets behind the vector. Inside the brackets place the number of the element that you’d like to extract. For example, vec[3] would return the third element of the vector named vec.
Use the chunk below to extract the fourth element of vec.
vec <- c(1, 2, 4, 8, 16)
vec[4]
More []
You can also use [] to extract multiple elements of a vector. Place the vector c(1,2,5) between the brackets below. What does R return?
vec <- c(1, 2, 4, 8, 16)
vec[]
vec <- c(1, 2, 4, 8, 16)
vec[c(1,2,5)]
Names
If the elements of your vector have names, you can extract them by name. To do so, place a name or vector of names in the brackets behind a vector. Surround each name with quotation marks, e.g. vec2[c("alpha", "beta")].
Extract the element named gamma from the vector below.
vec2 <- c(alpha = 1, beta = 2, gamma = 3)
vec2["gamma"]
Vectorised operations
Predict what the code below will return. Then look at the result.
c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) + c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Vector recycling
Predict what the code below will return. Then look at the result.
1 + c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
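The two exercises above can be sketched side by side. When both vectors have the same length, R pairs the elements up one by one. When one vector is shorter, R repeats (recycles) it until it matches the longer one. The last line is an extra illustration with a length-two vector that does not appear in the exercises.

```r
x <- 1:10

# Same length: elements are added pair by pair.
x + x                         # 2 4 6 8 10 12 14 16 18 20

# Length one: the 1 is recycled for every element of x.
1 + x                         # 2 3 4 5 6 7 8 9 10 11

# Length two against length four: c(1, 2) is used twice.
c(1, 2) + c(10, 20, 30, 40)   # 11 22 31 42
```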
Types
Atomic types
What type?
Integers
Create a vector of integers from one to five. Can you imagine why you might want to use integers instead of numbers/doubles?
c(1L, 2L, 3L, 4L, 5L)
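To confirm which type a vector has, one option is base R's typeof(). The sketch below compares numbers written with and without the L suffix, and shows that the : shortcut also produces integers.

```r
# Without the L suffix, R stores numbers as doubles.
typeof(c(1, 2, 3, 4, 5))        # "double"

# The L suffix asks for integers.
typeof(c(1L, 2L, 3L, 4L, 5L))   # "integer"

# The : shortcut creates integers as well.
typeof(1:5)                     # "integer"
identical(1:5, c(1L, 2L, 3L, 4L, 5L))   # TRUE
```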
Floating point arithmetic
Computers must use a finite amount of memory to store decimal numbers (which can sometimes require infinite precision). As a result, some decimals can only be saved as very precise approximations. From time to time you’ll notice side effects of this imprecision, like below.
Compute the square root of two, square the answer (e.g. multiply the square root of two by the square root of two), and then subtract two from the result. What answer do you expect? What answer do you get?
sqrt(2) * sqrt(2) - 2
sqrt(2)^2 - 2
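Because of this imprecision, testing doubles with == can give surprising results. A common workaround, sketched here, is to compare with a small tolerance using base R's all.equal() wrapped in isTRUE().

```r
result <- sqrt(2)^2 - 2

# The result is tiny, but it is not exactly zero.
result == 0                        # FALSE

# all.equal() treats values within a small tolerance as equal.
isTRUE(all.equal(sqrt(2)^2, 2))    # TRUE
```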
Vectors
Character or object?
One of the most common mistakes in R is to call an object when you mean to call a character string and vice versa.
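A minimal sketch of the difference, using hypothetical names: a bare name makes R look up an object, while the same characters inside quotation marks are just a string.

```r
fruit <- "apple"   # fruit is an object; "apple" is a character string

fruit              # looks up the object and returns "apple"
"fruit"            # the string "fruit"; no lookup happens
identical(fruit, "fruit")   # FALSE

# Leaving out the quotes asks R for an object, which may not exist.
# no_such_object is a hypothetical name that has never been defined.
msg <- tryCatch(no_such_object, error = function(e) conditionMessage(e))
msg
```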
Lists
Lists vs. vectors
Make a list
Make a list that contains the elements 1001, TRUE, and "stories". Give each element a name.
list(number = 1001, logical = TRUE, string = "stories")
Extract an element
Extract the number 1001 from the list below.
things <- list(number = 1001, logical = TRUE, string = "stories")
things$number
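The $ syntax is one of several ways to reach into a list. As a brief sketch, double brackets return the element itself, just like $, while single brackets return a smaller list that still wraps the element.

```r
things <- list(number = 1001, logical = TRUE, string = "stories")

# $ and [[ ]] both return the element itself.
things$number                # 1001
things[["number"]]           # 1001

# Single brackets return a list of length one.
is.list(things["number"])    # TRUE
is.list(things[["number"]])  # FALSE
```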
Data Frames
You can make a data frame with the data.frame() function, which works similarly to c() and list(). Assemble the vectors below into a data frame with the column names numbers, logicals, and strings.
nums <- c(1, 2, 3, 4)
logs <- c(TRUE, TRUE, FALSE, TRUE)
strs <- c("apple", "banana", "carrot", "duck")
data.frame(numbers = nums, logicals = logs, strings = strs)
Extract a column
Given that a data frame is a type of list (with named elements), how could you extract the strings column of the df data frame below? Do it.
nums <- c(1, 2, 3, 4)
logs <- c(TRUE, TRUE, FALSE, TRUE)
strs <- c("apple", "banana", "carrot", "duck")
df <- data.frame(numbers = nums, logicals = logs, strings = strs)
df$strings
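The claim that a data frame is a type of list can be checked directly. In the sketch below, the list tools from the previous section work unchanged on df: each column is one named element.

```r
nums <- c(1, 2, 3, 4)
logs <- c(TRUE, TRUE, FALSE, TRUE)
strs <- c("apple", "banana", "carrot", "duck")
df <- data.frame(numbers = nums, logicals = logs, strings = strs)

# A data frame passes the list test, with one element per column.
is.list(df)    # TRUE
length(df)     # 3
names(df)      # "numbers" "logicals" "strings"

# $ and [[ ]] extract the same column.
identical(df$strings, df[["strings"]])   # TRUE
```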
Packages
A common error
Load a package
In the code chunk below, load the tidyverse package.
Whenever you load a package R will also load all of the packages that the first package depends on. tidyverse takes advantage of this to create a shortcut for loading several common packages at once. Whenever you load tidyverse, tidyverse also loads ggplot2, dplyr, tibble, tidyr, readr, and purrr.
library(tidyverse)
Quotes
Did you know, library() is a special function in R? You can pass library() a package name in quotes, like library("tidyverse"), or not in quotes, like library(tidyverse). Both will work! That’s often not the case with R functions.
In general, you should always use quotes unless you are writing the name of something that is already loaded into R’s memory, like a function, vector, or data frame.
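The sketch below tries both forms with stats, a package that ships with R, so that it runs without installing anything. It also shows the character.only argument of library(), which is needed when the package name is stored in an object; pkg is a hypothetical object name chosen for illustration.

```r
library("stats")   # quoted package name
library(stats)     # unquoted package name, also accepted

# Either way, the package ends up on the search path.
"package:stats" %in% search()   # TRUE

# When the name is stored in an object, say so explicitly.
# Without character.only = TRUE, library() would look for a
# package that is literally called pkg.
pkg <- "stats"
library(pkg, character.only = TRUE)
```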
Install packages
But what if the package that you want to load is not installed on your computer? How would you install the dplyr package on your own computer?
install.packages("dplyr")
Congratulations!
Congratulations. You now have a formal sense for how the basics of R work. Although you may think of yourself as a Data Scientist, this brief Computer Science background will help you as you analyze data. Whenever R does something unexpected, you can apply your knowledge of how R works to figure out what went wrong.