Chapter 1 Introduction to R and RStudio

Welcome to Quantitative Reasoning! My name is Michael Gastner, and I will be one of your instructors in this course. This is the first in a series of video tutorials that will teach you how to use R and RStudio. We’ll use both throughout this semester to read and analyse data.

R is a programming language with many helpful features for working with data (e.g. importing spreadsheets, calculating summary statistics or preparing graphics). R can do everything we can accomplish with spreadsheet programs such as Excel or Google Sheets, and it can do a lot more because it is a full-fledged programming language. When we face a complex task (e.g. comparing statistics for 100 different spreadsheets), R can automate the process much more easily than Excel.

Admittedly, when learning the software from scratch, the first steps in R are more challenging than in Excel, but there are several resources available for learning R. Besides these video tutorials, we have written an R Compendium that we’re going to post on our learning management system Canvas. R also contains help documents for every command, including short example code. We’ll learn how to access R’s built-in help later in this tutorial. If neither the R Compendium nor R’s built-in documentation answers our question, we can search the World Wide Web. R has a very active user community, so it’s very likely that we can find an answer online.

RStudio is an integrated development environment (abbreviated as IDE) for R. An IDE has several tools that make it easy to write and run computer code. For example, RStudio has a code editor and a graphical user interface to import data. We’ll learn more about these features soon.

If you’re on a Mac or Windows computer, you can download R from the Comprehensive R Archive Network (https://cran.r-project.org/). The installation should be straightforward. If you’re on Linux, it’s best to use your package manager to install R.

To download RStudio, please go to https://rstudio.com/. From the Products menu, please select RStudio. On the next page, scroll down and click on “Download RStudio Desktop”. From the list of options on the following page, please choose the free RStudio Desktop version. After we have installed RStudio, we can find it on our computer by typing “RStudio” into Spotlight on a Mac or using similar tools on Windows or Linux. Click on the icon or hit the “return” key to open RStudio.

Before we start working with RStudio, let’s customise some of its settings. Please go to the menu item “Tools” -> “Global Options”. Look for the drop-down menu that says “Save workspace to .RData on exit”. Let’s choose “Never” from the menu. Next we remove the tick mark from all checkboxes above this drop-down menu. That is, no, we don’t want to reuse or restore anything at startup. These recommended settings are less confusing when learning R and RStudio.

There are a few more editor settings that I personally find useful, so I click on “Code” in the sidebar, and then on the tab “Display”. I put ticks in the boxes in front of “Show line numbers” and “Show margin”. To follow generally recommended practices, I set the margin column to 80. I also tick the box for “Show indent guides.” There are many other options we can change (e.g. font size or background colour). Feel free to play with the settings. When you’re done, please click OK.

Now let’s take a look at the RStudio window. When we open RStudio, we see three panes by default. The pane on the left contains the console. We can think of the console as a luxury version of a pocket calculator. For example, we can type 1+1, hit the return key, and R prints 2. The console is a powerful tool, but we can’t easily save the commands we type in the console. A better tool is the RStudio editor.
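To make the pocket-calculator comparison concrete, here are a few more lines we could type at the console, one at a time. This is only a small sketch; the numbers are arbitrary examples and not part of the tutorial itself.

```r
# Basic arithmetic at the console; each line is typed and then
# followed by the return key.
1 + 1        # addition: 2
7 - 3        # subtraction: 4
6 * 4        # multiplication: 24
10 / 4       # division: 2.5
2^3          # exponentiation: 8
(1 + 2) * 3  # parentheses control the order of operations: 9
```

As in the 1+1 example, R prints each result directly below the command.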

We can open the editor by clicking on the menu item “File”. Then we go to “New File” and “R Script”. Alternatively, we can use the keyboard shortcut “Shift-Command-N” on a Mac or “Shift-Control-N” on Windows and Linux. The editor appears as a new pane in the top left corner of the RStudio window.

Let’s run the command 1+1 again, but this time we run it from the editor instead of the console. We type 1+1 and click “Run”. RStudio then copies the line 1+1 to the console and immediately prints the result 2. The right angle bracket in the console indicates that the following command is input, and the 1 in square brackets on the line below tells us that the following number is the first item of output. Later in this course, we’ll work with larger data sets, so we’ll see larger numbers in the square brackets.
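To see larger numbers in the square brackets right away, we can try the following small sketch. It relies on the colon operator, which this tutorial has not introduced: 101:130 creates the whole numbers from 101 to 130.

```r
# Print 30 consecutive whole numbers. The colon operator is an
# assumption here; it has not been covered in this tutorial yet.
101:130
```

If the output wraps onto several lines, each line begins with the position of its first item in square brackets. Where the lines break depends on the width of the console.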

For simple commands like 1+1, there isn’t a big difference between typing in the editor and typing in the console. However, as we’ll learn in tutorial 04, the editor allows saving our code to a file so that we can easily retrieve our past work. This feature is very handy when we write longer R code. For this reason, it’s a good habit to use the editor as much as possible. Avoid typing commands in the console unless it’s for exploratory work that you don’t need to save.

The upper right pane contains a tab called “Environment”, which can be very useful. The Environment tab lists the variables currently stored in memory. The environment is currently empty because we haven’t defined any variables yet. Let’s change that now. We head back to the editor pane and put x and an arrow composed of a left angle bracket and a hyphen in front of 1+1.

x <- 1+1

The combination of a left angle bracket and a hyphen is called the “assignment operator”, sometimes pronounced “gets” as in “x gets 1+1”. We can see its effect by placing the cursor inside the editor on the line with the assignment operator. Then we click “Run”. Now the variable x is in the environment. It has been assigned the value 1+1, that is, 2. We can also retrieve the value of x by typing x at the console followed by the return key.
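The steps above can be sketched in code as follows. The function ls() is not part of this tutorial; it is assumed here as a way to list, from the console, the same variable names that the Environment tab shows.

```r
x <- 1 + 1   # "x gets 1 + 1": store the value 2 in the variable x
x            # typing the name prints the stored value
ls()         # list the names of the variables defined so far
```

Note that the assignment itself prints nothing. We only see the value once we ask for x.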

We could have given the variable a different name (e.g. y). We can also assign a different value than 1+1 to the variable. After finishing this tutorial, try out for yourself what happens when you use different variable names and values.
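As a starting point for such experiments, here is a minimal sketch. The names y and z and the values are arbitrary choices for illustration.

```r
y <- 10      # a different variable name and a different value
y <- y * 2   # the right-hand side is evaluated first, so y becomes 20
z <- y + 1   # a variable can be used to define another variable
z            # prints 21
```

Assigning to an existing name, as in the second line, replaces the old value. The Environment tab updates accordingly.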

The bottom right pane serves multiple purposes, as we can tell from the various tabs at the top of the pane. One important tab is “Help”. Let’s explore R’s built-in help with an example. Suppose we want to find the maximum of three numbers, say 3, 5 and 4. R has a function max() that finds the maximum of its arguments. In our example, we type in the editor max and in parentheses 3, 5 and 4, separated by commas.

max(3, 5, 4)

Let’s move the cursor onto the line we have just typed and click “Run”. We’ll see the result 5 in the console.

## [1] 5
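We can combine max() with what we learned about the assignment operator. The sketch below also uses min(), the counterpart of max() that returns the smallest argument; min() and the variable names largest and smallest are additions that do not appear elsewhere in this tutorial.

```r
largest <- max(3, 5, 4)    # store the maximum, 5, in a variable
smallest <- min(3, 5, 4)   # min() returns the smallest argument, 3
largest - smallest         # difference between the two: 2
```

Storing the result of a function in a variable lets us reuse it in later commands without calling the function again.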

Suppose we need help with the max() function and would like to see simple examples of how it is used. We access R’s built-in help document for max() by typing ?max into the console followed by the return key.

?max

This command opens the help document in the bottom right pane. Every help document starts with a brief description. If we scroll down, we can find example code. Don’t worry if these examples look cryptic at this stage. As we work with R during this course, they will become clearer. The important point is that we can access the help document for every R function by typing a question mark followed by the function name into the console. You will probably find this feature useful in the future.
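The question mark is not the only route to the help pages. The following sketch shows two related functions, help() and example(), that are not covered in this tutorial; they are listed here only as possibilities to try.

```r
?max             # the command from the text
help("max")      # opens the same help document as ?max
example("max")   # runs the example code from the help document for max()
```

Running example("max") prints the commands from the “Examples” section together with their output, which can be a gentler way to explore them than reading the code alone. It may also draw a few plots.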

Here is a summary of the main points in this tutorial. We learned that the RStudio window has four panes:

  • The editor in the top left. In this pane, we type R commands. The editor pane may not be immediately open when you start RStudio. To open it, either start a new file or open an existing R file.
  • The console pane is in the bottom left. In this pane, we see the output of the commands.
  • The environment pane is in the top right. Here we can see the variables we have defined during our R session.
  • And, finally, in the bottom right pane, we can read pages from R’s documentation in the “help” tab. We can access R’s built-in help with the ?-operator.

We also learned that the assignment operator, composed of a left angle bracket and a hyphen, assigns a value to a variable. In the next tutorial, we’ll use the assignment operator again to define more complicated variables that store more than a single value. We’ll also learn how to extract individual values from such variables.

See you soon.