# Introduction

In this first tutorial of the series, we’ll cover the very basics of R. If you’ve programmed before, you can skip much of this. But regardless of your background, we hope you’ll find this and subsequent tutorials useful for learning R’s many tools for graphing, statistical analysis, and data collection and management, or what we might collectively call “data science.”

## Installing R

First, download R for *free* here!

After you’ve downloaded R itself, you will probably also want to download a program called RStudio (installation instructions here; note that you need to already have R installed to use RStudio). RStudio is a “helper” program that makes it easier to write code for R; it is what is referred to as an “integrated development environment” (IDE). There are lots of other IDEs, but RStudio is one of the easiest to use and one of the most popular.

## Interacting with R

R has a text-based interface, which means you can’t ask it to do things by clicking buttons or using drop-down menus. Instead, it has a command prompt where you type messages to R. The place where you type is marked with a little right arrow symbol (`>`). Just type your message after that right arrow and hit return. In the screenshot below, for example, I’m about to ask R to print the phrase “Hello!” (though I haven’t hit return, so it hasn’t done anything yet):

After you type a command to R, hit return and R will try to do what you’ve asked of it. If I hit return after `print("Hello!")`, for example, R will print out the phrase “Hello!”:

You will notice that after it printed out “Hello!”, a new right arrow appeared. That’s R’s way of saying it’s done doing the last thing you asked it to do.
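As a minimal sketch of that exchange, the command and R's reply look roughly like this. The `[1]` in front of the reply is just R's label for the first element of its output:

```r
# Type this after the `>` prompt and hit return.
print("Hello!")
# R prints: [1] "Hello!"
```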

### Code Examples On This Site

On this website, you’ll find that code examples don’t look quite like they do when you’re typing in R yourself. Instead, you’ll see code appear in grey blocks with a number on the left side. Below these blocks, you will see the output R has returned after running that code. For example, here’s that same `"Hello!"` line in the style used on this site:

```
"Hello!"
```

In addition, some code will include “comments”. Comments are notes placed in code to explain what’s going on to other programmers. Comments in R always start with a `#`, which tells R that the text that follows is not something it should try to execute. Comments will always appear in italics and in yellow.

```
# This is a comment. In the next line, I'll add 2 and 3.
2 + 3
```
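A comment doesn't have to sit on its own line. As a small sketch (not taken from the original examples), the following shows a comment placed after some code, and a line of code that has been switched off by turning it into a comment:

```r
2 + 3  # R runs `2 + 3` and ignores everything after the `#`

# The next line is "commented out", so R never runs it:
# 2 + 100
```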

## Basic Math in R

Now that we’ve learned how to pass commands to R, we can start asking R to do things for us. For example, R can do all the normal math operations you are familiar with:

```
# Addition
2 + 2
```

```
# Multiplication
2 * 3
```

```
# Division
4 / 2
```

```
# And even exponentiation (e.g. 2 raised to the third power)
2 ^ 3
```
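Subtraction works the same way, and R follows the usual order of operations, so parentheses can be used to control which part of a calculation happens first. A short sketch:

```r
# Subtraction
5 - 3

# Multiplication happens before addition, so this is 2 + 12 = 14
2 + 3 * 4

# Parentheses force the addition to happen first: 5 * 4 = 20
(2 + 3) * 4
```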

## Variables

Congratulations! You now know how to do math in R!

If we want to do more than use R as a calculator, though, we need to be able to not only do math problems, but also store the results of our calculations so we can reuse them in the future, or combine the results of lots of different calculations. In the examples above, R did the math we asked it to do, and printed out the results, but it didn’t keep a copy of those results anywhere.

In order to store the *value* of our calculations, we need to assign them to a *variable*. Once we’ve assigned a value to a variable, we can then recall that value any time by invoking the variable. For example, let’s calculate the weight of a velociraptor in pounds.

First, let’s store the weight of a velociraptor in kilograms (estimated to be 113 kg) in a variable called `velociraptor_weight_in_kg`:

```
velociraptor_weight_in_kg = 113
```

Basically, all I’ve done is give a name (the variable name) to a value (in this case, 113). Now any time I use that variable name, R knows that it is just a stand-in for 113. For example, if you just type a variable name in R, it will tell you the value associated with that variable:

```
velociraptor_weight_in_kg
```

And now we can do calculations with that variable. There are about 2.2 pounds in a kilogram, so to get a velociraptor’s weight in pounds, we can just multiply our weight-in-kg variable by the conversion factor:

```
velociraptor_weight_in_kg * 2.2
```
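The result of that multiplication can itself be stored in a variable. The sketch below assumes two variable names that do not appear in the tutorial, `pounds_per_kg` and `velociraptor_weight_in_lbs`, chosen only for illustration:

```r
velociraptor_weight_in_kg = 113

# Hypothetical names, used only for this illustration.
pounds_per_kg = 2.2
velociraptor_weight_in_lbs = velociraptor_weight_in_kg * pounds_per_kg

# Typing the name shows the stored result (about 248.6).
velociraptor_weight_in_lbs
```

Giving the conversion factor its own name makes the calculation easier to read later than a bare `2.2` would be.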

We can also do math with multiple variables, because really anywhere you see a variable, you can just imagine that the value associated with the variable is there instead.

For example, suppose I have two pet dinosaurs, and my partner has three dinosaurs. If we got married, how many dinosaurs would we have? Let’s do this super-complicated math using variables.

```
nick_pet_dinosaurs = 2
adriane_pet_dinosaurs = 3
nick_pet_dinosaurs + adriane_pet_dinosaurs
```

And if we wanted to, we could also store that new value in a new variable called `family_pet_dinosaurs`.

```
family_pet_dinosaurs = nick_pet_dinosaurs + adriane_pet_dinosaurs
family_pet_dinosaurs
```

One important thing about variables is that you can change the value associated with a variable. Suppose that while walking to work, I stumbled upon a truly adorable Nigersaurus and couldn’t resist adopting her.

(Photo Credit: Matt Martyniuk (Dinoguy2))

If that happened, we’d need to increase the number of pets I have by 1!

```
nick_pet_dinosaurs = nick_pet_dinosaurs + 1
```

Now if we ask R for the value of `nick_pet_dinosaurs`, we’ll see it has increased by 1:

```
nick_pet_dinosaurs
```

Note there’s something a little weird about the order in which things happen here: when we assign something to a variable by writing `variable_name = [some expression]`, **R evaluates the expression on the right first, then assigns the result of that expression to the variable on the left-hand side.**

Given that we normally read left-to-right, this can be a little confusing. So what R did here was:

- *first* calculate `nick_pet_dinosaurs + 1` (which is the same as `2 + 1`),
- *then* assign the value of that expression (`3`) to the variable `nick_pet_dinosaurs`, replacing the old value of `2`.
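To trace this order of evaluation, here is a small sketch that reuses the variables from above. It also shows a consequence worth knowing: a variable computed earlier, like `family_pet_dinosaurs`, holds on to its old value until its assignment is run again.

```r
nick_pet_dinosaurs = 2
adriane_pet_dinosaurs = 3
family_pet_dinosaurs = nick_pet_dinosaurs + adriane_pet_dinosaurs  # 2 + 3 = 5

# Right side first: 2 + 1 = 3, then 3 replaces the old value.
nick_pet_dinosaurs = nick_pet_dinosaurs + 1

# family_pet_dinosaurs was computed earlier, so it still holds 5.
family_pet_dinosaurs

# Re-running the assignment picks up the new value: 3 + 3 = 6.
family_pet_dinosaurs = nick_pet_dinosaurs + adriane_pet_dinosaurs
family_pet_dinosaurs
```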

### Variable Exercises

OK, this is a great time to pause and try a few exercises for yourself.

Let’s suppose that you have a dinosaur zoo. In your zoo, you have two T-Rexes, three Unaysaurus, and five Spinosaurus.

- Create variables for the number of each dino you have, called `my_trexes`, `my_unas`, and `my_spinos`.
- Now use those variables to calculate how many total dinosaurs you have.
- Oh no! One of your T-Rexes got out and ate an Unaysaurus. Decrease the value of `my_unas` by one.
- Double oh no! Your T-Rexes were male and female, and they just had a baby! Increase your number of T-Rexes by one!
- Sadly, one of your Spinosauruses died of old age. :( Decrease your count of Spinosauruses by one.
- How many dinos do you have now? You’ve probably lost count of all these changes, but thankfully they’re all stored in variables, so you can just add them all up!

Answers to exercises can be found here, but only go to that page if you get REALLY stuck! The best way to learn to program is to try things out and see what works, so don’t deny yourself the learning opportunity that process provides by looking at answers too quickly!

Note: R has two ways to assign a value to a variable. One is with the equals sign (`=`), and the other is with the two symbols that make an arrow (`<-`). So the following two commands do exactly the same thing:

```
x = 72
x <- 72
```
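To see that `=` and `<-` store the same thing, you can assign the same value with each operator and compare the results. The sketch below uses `identical()`, a base R function not otherwise covered in this tutorial, and a second variable `y` introduced only for the comparison:

```r
x = 72
y <- 72

# identical() returns TRUE when its two arguments are exactly the same.
identical(x, y)
```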

## Types of Data

Up till now, we’ve only been working with numbers, but R is actually equipped to work with a number of different *kinds* of data. In the course of this tutorial we’ll introduce all of them, but there are really three main ones to be aware of:

- `numeric` and `integer`: The main data types for numbers. These two types are *slightly* different (integers are restricted to, well, integers; numeric can be an integer or a number with decimals), but you can think of them as interchangeable for now.
- `character`: Text data, like a person’s name, or a quote from a book. Written with `"` before and after (or single quotes (`'`) before and after, if you’d prefer).
- `logical`: Data that only takes on the values of true and false. Written `TRUE` and `FALSE`.

If you’re ever unsure of the type of a variable (or more precisely, of the type of the value associated with a variable), you can ask R with the `class()` function:

```
pi = 3.1416
class(pi)
```

```
mystery_novel = "'Twas a dark and stormy night"
class(mystery_novel)
```

```
my_logical = TRUE
class(my_logical)
```
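The difference between `numeric` and `integer` can be seen with `class()` too. This sketch relies on one detail not mentioned above: in R, a whole number typed on its own is stored as `numeric`, and adding an `L` suffix asks for an `integer` instead.

```r
class(5)      # "numeric"
class(5L)     # "integer"
class(2.5)    # "numeric"

# Single and double quotes both produce a character value.
class('Nigersaurus')  # "character"
```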

### Characters

The value of characters will be evident when we start working with real data, when our datasets will include things like country names, or capitals, or open-ended survey responses. In all these situations, we will use the `character` type to store text.

Note that if a variable is a character, even if it looks like a number, R will treat it like text and you won’t be able to do things like add or multiply it. For example:

```
a = 5
b = "7"
a + b
```

This will generate an error:

```
Error in a + b: non-numeric argument to binary operator
```

That’s because `+` is only defined for numbers, and R sees `"7"` as text. I’ll show you how to deal with this silly situation below.
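When an error like this shows up, `class()` is a quick way to find the culprit. A minimal sketch using the same two variables:

```r
a = 5
b = "7"

class(a)  # "numeric"
class(b)  # "character", which is why `a + b` fails
```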

### Logicals

The use of logicals is less obvious at the moment, but they will eventually prove very important. For now, it’s enough to know they exist, and that the place you are most likely to encounter them is when you write tests. For example:

```
7 > 5
```

```
4 < 3
```

This is also a good time to introduce the double-equal sign. Because we use `=` to assign values to variables, we can’t use it to test if two things are equal. To ask R to evaluate whether two things are equal, therefore, we use a double-equal sign (`==`). For example:

```
a = 5
b = 5
a == b
```

```
c = 7
a == c
```
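R has a few more comparison operators that work the same way, and the result of any test can be stored in a variable like any other value. A short sketch (the variable name `a_equals_c` is made up for this example):

```r
a = 5
c = 7

a != c   # "not equal": TRUE
a <= c   # "less than or equal to": TRUE
a >= c   # "greater than or equal to": FALSE

# The result of a test is a logical value that can be stored.
a_equals_c = (a == c)
a_equals_c         # FALSE
class(a_equals_c)  # "logical"
```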

## A Brief Introduction to Functions

One last note before we finish off this section: one of the most powerful things about a language like R is that it’s full of pre-made tools called *functions*. A function is basically a little program inside R. Functions can do everything from simple operations (like adding up numbers) to extremely complicated operations (like fitting a machine learning model).

The idea of a function is that it takes in a set of *arguments* as an input, and then it *returns* a result. To use a function, you type out the function name, then place the arguments you want to pass to the function inside parentheses. For example, there’s a function called `as.numeric()` that’s designed to convert a variable that is a character (like `"7"`) to a numeric value (`7`) so we can do arithmetic with that value.

So to use it, we pass the function `as.numeric` the variable (`b`). The function will then take the value associated with `b` (`"7"`), convert it to a numeric value, then return the converted value. For example:

```
a = 5
b = "7"
b_as_a_numeric = as.numeric(b)
a + b_as_a_numeric
```

Note that `as.numeric` didn’t actually change the value of the variable `b`. Instead, it returned the converted value, and we assigned it to the variable `b_as_a_numeric`. If we look at `b`, we will see it is still `"7"`:

```
b
```
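Other functions are called in exactly the same way. The sketch below uses two base R functions that are not covered in the text above, `sum()` and `round()`, to show a function taking several arguments, and then shows how to overwrite `b` with its converted value if that is what you want:

```r
# sum() adds up all of the arguments it is given.
sum(1, 2, 3)                  # 6

# round() takes a number and, optionally, how many digits to keep.
round(3.14159, digits = 2)    # 3.14

# To change `b` itself, assign the function's result back to `b`.
b = "7"
b = as.numeric(b)
class(b)                      # "numeric"
```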

But `as.numeric` won’t work on everything. If you pass `as.numeric` a character that doesn’t look like a number, it’s smart enough to know that there’s no way to convert it, so instead of a number, it returns `NA` (along with a warning), which is what R uses to mark a missing value. For example:

```
silly_example = "This doesn't look like a number at all!"
as.numeric(silly_example)
# R responds with a warning and returns NA:
#   Warning message: NAs introduced by coercion
#   NA
```
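Because an `NA` carries through into later calculations, it is often worth checking for it. A minimal sketch, using the base R functions `is.na()` (which tests for `NA`) and `suppressWarnings()` (which hides the coercion warning); the variable name `converted` is made up for this example:

```r
silly_example = "This doesn't look like a number at all!"

# suppressWarnings() hides the "NAs introduced by coercion" warning.
converted = suppressWarnings(as.numeric(silly_example))

# is.na() returns TRUE when a value is NA.
is.na(converted)

# A string that does look like a number converts normally.
is.na(as.numeric("7"))
```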

We’ll talk much more about functions in the future, but for now it’s enough to recognize that they are little programs, and that they operate by accepting inputs (the things placed inside the parentheses that follow the function name) and returning a result, which you can look at or assign to a variable for later use.

### Exercises for Data Types
