Module 3 Exploring our IDE (RStudio)

3.1 Before class #3

Install R and RStudio.

We are using RStudio as our IDE for this course. If you are running your R code on your own computer, you need to install both R and RStudio. Alternatively, you can create a free account at http://rstudio.cloud and run your R code in the cloud. Either way, we will be using the same IDE (i.e., RStudio).

What’s an IDE? IDE stands for integrated development environment, and its goal is to facilitate coding by integrating a text editor, a console and other tools into one window.

3.1.1 I’ve never installed R and RStudio on my computer OR I’m not sure I have R and RStudio installed on my computer

  1. Download and install R from https://cran.r-project.org (if you are a Windows user, first determine whether you are running the 32-bit or the 64-bit version)
  2. Download and install RStudio from https://rstudio.com/products/rstudio/download/#download

Here’s a video on how to install R and RStudio on a Mac.

3.1.2 I already have R and RStudio installed

  1. Open RStudio
  2. Check your R version by entering sessionInfo() in your console.
  3. At the time of writing, the latest release of R was version 4.0.2 (“Taking Off Again”), released on June 22, 2020. If your R version is older than the most recent release, please follow step 1 in the previous section to update R.
  4. Check your RStudio version. If it is older than version 1.3.x, please follow step 2 in the previous section to update RStudio.
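
As a quicker alternative to reading the full sessionInfo() output, the version check can be sketched in the console as follows. The comparison against 4.0.2 simply mirrors the release mentioned above, and RStudio.Version() is assumed to be available only inside an RStudio session, so the sketch guards that call.

```r
# Print the version of R that is currently running
R.version.string

# TRUE if this R is at least the 4.0.2 release mentioned above
r_is_current <- getRversion() >= "4.0.2"
r_is_current

# RStudio.Version() only exists when the code runs inside RStudio,
# so check for it first (assumption: you may be running plain R)
if (exists("RStudio.Version")) {
  RStudio.Version()$version
}
```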

How often should I update R and RStudio? Always make sure that you have the latest version of R, RStudio, and the packages you’re using in your code, so that you don’t run into bugs caused by older versions installed on your computer.

When asked, Jenny Bryan summarizes the importance of keeping your system up-to-date saying that “You will always eventually have a reason that you must update. So you can either do that very infrequently, suffer with old versions in the middle, and experience great pain at update. Or admit that maintaining your system is a normal ongoing activity, and do it more often.”

 

You can also keep your packages up-to-date by clicking on “Tools” in the RStudio top menu bar and selecting “Check for Package Updates…”
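
The same check can also be done from the console. The sketch below is one possible way to do it, not the only one: the CRAN mirror URL is an assumption (any mirror should work), and the calls are wrapped in interactive() because they need an internet connection.

```r
# CRAN mirror used for the check (an assumption; any CRAN mirror works)
cran_mirror <- "https://cloud.r-project.org"

# Only run in an interactive session, since this needs internet access
if (interactive()) {
  # List installed packages that have a newer version on CRAN
  outdated <- old.packages(repos = cran_mirror)
  print(rownames(outdated))

  # Uncomment to install the updates without being asked about each one
  # update.packages(repos = cran_mirror, ask = FALSE)
}
```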

3.2 Why learn R?

R is both a programming language and a free software environment for statistical computing and graphics. In addition to being free, here are other reasons to learn R:

  • R is popular. According to Robert A. Muenchen’s post on the popularity of data science software (which is updated frequently), R is among the top 5 technologies that are mentioned in data science job ads on indeed.com.

  • R is very powerful and versatile. From creating websites (like this bookdown you’re reading right now) to building machine learning models, R has it all.

  • The R community is active and very supportive. Because R is so popular, there are a number of forums on R. A good way to get a glimpse of how active the R community is is to follow #rstats on Twitter.

3.3 Why use RStudio?

You can just use R, but RStudio is an IDE that makes using R easier and more fun. Here are some features that make RStudio the IDE of choice for many data scientists:

  • RStudio is free and open source.

  • RStudio contains a full-featured integrated text editor, with tab-completion, spellcheck, etc.

  • RStudio is a cross-platform interface that looks the same across platforms.

  • RStudio allows you to organize your data science projects so you’re not always hunting for the right script that goes with the data you want to analyze. It also integrates nicely with rmarkdown and knitr.

3.4 Create an R Project

In today’s class, we will focus on getting oriented in our IDE. For every lesson, we will either start a new R project or open an R project we’ve been working on.

Why create an RStudio project? RStudio projects make it easier to keep your work organized, since each project has its own working directory, workspace, history, and source documents. In other words, it’s much easier to open an R project and not have to worry about setting your working directory than to try to hunt down your files.
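
To see what the working directory means in practice, you could run something like the sketch below in the console once a project is open. The file name data/pets.csv is hypothetical and only illustrates a path written relative to the project folder.

```r
# Where is R currently looking for files?
# Inside an RStudio project this is the project's folder.
project_dir <- getwd()
project_dir

# List the files in the working directory
list.files(project_dir)

# Build a path relative to the project folder
# ("data/pets.csv" is a hypothetical file, used only for illustration)
pets_path <- file.path("data", "pets.csv")
file.exists(pets_path)
```

Because the path is relative, the same script keeps working if the whole project folder is moved or shared.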

Here are the steps we are starting with today:

  1. Start a new R project

  2. Create a new R script

  3. Save that R script as 01-class_one

 

CHALLENGE

Take a moment to look around your IDE. What are the 4 main panes of the RStudio interface? Can you guess what each one is for?

3.5 Operations and Objects

Let’s start by using R as a calculator. In your console, type 3 + 3 and hit Enter.

3 + 3
## [1] 6

What symbols do we use for all basic operations (addition, subtraction, multiplication, and division)? What happens if you type 3 +?
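
After trying these questions yourself, you can compare your answers with the short sketch below. It lists the basic arithmetic operators, plus exponentiation as an extra, with the output you should see in the console.

```r
3 + 3    # addition
## [1] 6
10 - 4   # subtraction
## [1] 6
3 * 4    # multiplication
## [1] 12
10 / 4   # division
## [1] 2.5
2 ^ 3    # exponentiation
## [1] 8

# An incomplete expression such as `3 +` is not an error: the console
# shows a + prompt and waits for the rest. Press Esc to cancel.
```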

Let’s save our calculation into an object, by using the assignment symbol <-.

sum_result <- 3 + 3

Take a moment to look around your IDE once again. What has changed?

Now, let’s use this new object in a calculation:

sum_result + 3
## [1] 9

Take a moment to look around your IDE once again. Has anything changed?
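
The difference between assigning a value and merely using an object can be sketched as follows. The name bigger_result is made up for this illustration.

```r
sum_result <- 3 + 3       # assignment: nothing is printed

sum_result                # typing the name prints the stored value
## [1] 6

sum_result + 3            # the result is printed but not stored
## [1] 9

sum_result                # the object still holds 6
## [1] 6

bigger_result <- sum_result + 3   # hypothetical name: store the new value
bigger_result
## [1] 9
```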

What else can we do with an object?

class(sum_result)
## [1] "numeric"

R is primarily a functional programming language. That means that there are pre-programmed functions in base R, such as class(), and that you can also write your own functions (more on that later).
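
As a small preview, here is a minimal sketch that puts a built-in function next to one we write ourselves. The function name add_three is invented for this example.

```r
# class() is a function that ships with base R
class(3 + 3)
## [1] "numeric"

# A function we write ourselves (add_three is a made-up name)
add_three <- function(x) {
  x + 3
}

add_three(6)
## [1] 9

# Functions are objects too, so class() works on them
class(add_three)
## [1] "function"
```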

Type ?class in your console and hit enter to get more information about this function.

CHALLENGE

Create an object called daisys_age that holds the number 8. Multiply daisys_age by 4 and save the result in another object called daisys_human_age.

Imagine I had multiple pets (unfortunately, that is not true; Daisy is my only pet). I can create a vector to hold multiple numbers representing the age of each of my pets.

my_pets_ages <- c(8, 2, 6, 3, 1)

Take a moment to look around your IDE once again. What has changed?

What is the class of the object my_pets_ages?

Now let’s multiply this vector by 4.

my_pets_ages * 4
## [1] 32  8 24 12  4
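
The multiplication above is applied to every element of the vector at once. The sketch below shows a few other things you might try with the same vector; the choice of functions (length(), indexing with square brackets, mean()) is ours and just for illustration.

```r
my_pets_ages <- c(8, 2, 6, 3, 1)

length(my_pets_ages)      # how many elements?
## [1] 5

my_pets_ages[2]           # second element (R counts from 1)
## [1] 2

my_pets_ages * 4          # each element is multiplied by 4
## [1] 32  8 24 12  4

mean(my_pets_ages)        # functions can summarize the whole vector
## [1] 4
```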

Errors are pretty common when writing code in any programming language, so be ready to read error messages and debug your code. Let’s insert a typing error in our previous code:

my_pets_ages <- c(8, 2, 6, '3', 1)

CHALLENGE

Try to multiply my_pets_ages by 4. What happens? How can we debug our code to find out what is causing the error?
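
Once you have tried the challenge yourself, here is one possible way to investigate. It is a sketch rather than the only approach: it inspects the vector, captures the error message with tryCatch(), and then converts the values back to numbers. The names error_message and fixed_ages are made up for this example.

```r
my_pets_ages <- c(8, 2, 6, '3', 1)

# A vector holds a single type, so one quoted value turns every
# element into text
my_pets_ages
## [1] "8" "2" "6" "3" "1"
class(my_pets_ages)
## [1] "character"

# tryCatch() captures the error message instead of stopping the script
error_message <- tryCatch(
  my_pets_ages * 4,
  error = function(e) conditionMessage(e)
)
error_message

# One possible fix: convert the text back to numbers
fixed_ages <- as.numeric(my_pets_ages)
fixed_ages * 4
## [1] 32  8 24 12  4
```

In a real script the better fix is usually to remove the quotes at the source, so the vector is numeric from the start.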