Introduction to R

Session 1

Session Overview

  1. SBE R Training
  2. What is R?
  3. Installing R
  4. First Steps in R
  5. What is RStudio?
  6. Installing RStudio
  7. First Steps in RStudio
  8. R Script
  9. R Package Management System

R Training Team

R Training Team

  • Nalan Bastürk
  • Tobias Hartl
  • Martin Schumann
  • Stephan Smeekes
  • Ines Wilms

All are econometricians at the Department of Quantitative Economics, and have many years of experience in R.

R Training Team: Tobias Hartl

  • Assistant Professor at QE
  • Research interests: econometrics, state space and factor models, long memory, macro- and climate econometrics
  • Website
  • Not physically present at sessions due to parental leave

R Training Team: Martin Schumann

  • Assistant Professor at QE
  • Research interests: panel data, nonlinear models, difference-in-differences, network data, innovation
  • Website
  • Sessions 1 and 2

R Training Team: Ines Wilms

  • Associate Professor at QE
  • Research interests: econometrics, time series, high-dimensional statistics, forecasting, outlier robustness
  • Website
  • Sessions 1, 3 and 5

SBE R Training

  • R is being used more and more in SBE teaching; master programmes use R extensively, bachelor programs are moving towards it.

  • Very useful for statistical analysis in research as well.

  • Big step to transition from “menu-based” software like SPSS to “script-based” software like R.

    • Note: once familiar with R, step to transition to other similar software, like Python or Julia is much smaller.
  • Request by Education Institute to set up training to make the transition easier.

  • For participants with a UTQ: we are trying to get participation in the training recognised as CPD hours.

  • First official edition of the training; your feedback will be crucial for improvements!

What is R?

What is R?

R

  • is an open-source statistical programming language.
  • available for most operating systems
  • helps users analyze, visualize, and model data—from simple summaries to complex data analysis
  • is extremely popular in statistics and data science, see this article
  • includes thousands of packages (add-ons) that can be used for specialized tasks without a deep understanding of the language and programming skills
  • allows for extensive programming, making it also suitable for advanced or case-specific applications of statistical methods

The combination of the latter two aspects sets R apart and makes it useful for everything ranging from standard ‘basic’ statistical analysis to the development of new methods.

The homepage of R is www.r-project.org from which you can install R and access manuals that provide detailed information about installing and using R.

Installing R

R Installation

To install R, go to the Comprehensive R Archive Network CRAN. At the time of writing, it looks as follows:

R Installation on Windows

To download, for instance, R for Windows, you arrive at the following page:

R Installation on Windows

Now download the latest version of R (version 4.5.0 at the time of writing):

R Installation on Windows

Start the download process:

R Installation on Windows

Continue the download process:

R Installation on Windows

Download completed:

R Installation on Mac

Choose the right version for your Mac:

R Installation on Mac

Download the installation file and open it:

R Installation on Mac

Simply click continue everywhere, the default installation is fine.

  • When asked, enter your password and accept the license agreement.

First Steps in R

Opening R

Open R, you should see the following:

Opening R

The program window provides some basic information on R and the installed version.

Check your version of R!

R Console

The R Console can be used to directly give in commands and display output:

R Prompt

Under the default information, you can see the R prompt, through which R indicates that it is ready to execute a new command.

Executing your first command

Start by using R as a simple calculator, try to enter after the R prompt:

2+3

and hit enter.

You will see that R directly returns the output in the R console next to the ‘[1]’:

2+3
[1] 5

If you want to re-execute your previous command, use the arrow key . To make changes to a command, you can use the ←, → arrows and re-execute. To move to the next command, use the arrow.

Finally, when R receives an incomplete expression such as

2+

R will return the + symbol, thereby letting you know that you forgot to type something. You can either complete the command and then hit enter or exit the incomplete command via the esc button.

Exercise 1.1

Enter the following expressions, one by one, in R and hit enter to see how R evaluates them:

2 + 3*4

(2 + 3)*4

(2 + 3*4

sqrt(4)

pi

From R to Rstudio

After taking your first steps in R, you might not be impressed by the design of the user interface and the way the software is used.

There are various user interfaces that work on top of plain R to make it more user friendly. A very popular one is RStudio.

What is RStudio?

What is RStudio?

RStudio

  • is an Integrated Development Environment (IDE) for R

  • is a user-friendly interface for writing and running R code

  • makes it easier to write code, analyze data, create visuals, access documentation and manage projects

Its homepage is https://posit.co/download/rstudio-desktop/

Installing RStudio

RStudio Installation on Windows

To download RStudio, go to its download page. At the time of writing, it looks as follows:

RStudio Installation on Windows

Start the download process:

RStudio Installation on Mac

If you open the (same) download page on Mac, you directly see the download button for Mac:

RStudio Installation on Mac

Open the downloaded RStudio dmg file, and drag the RStudio icon into the applications folder:

First Steps in RStudio

Opening RStudio

Open RStudio, you should see the following:

Opening RStudio

You will recognize the R Console and see that RStudio is ready to receive input:

Exercise 1.2

Re-execute the commands you executed before in R now in the R Console of RStudio:

2 + 3*4

(2 + 3)*4

(2 + 3*4

sqrt(4)

pi

The console in RStudio behaves exactly the same as the plain R window!

Structure in RStudio

But RStudio has much more to offer than plain R!

RStudio (unlike R) is structured in different windows. You should currently see 3 windows:

  • Left: R console
  • Top right: Display of objects in the global environment
  • Bottom right: Files, plots, packages, help, etc.

RStudio Windows

The 3 main windows (aka panes) in RStudio:

RStudio Menu Bar

RStudio also has a Menu bar at the top:

The usefulness of these windows and menu bar in RStudio will become clear throughout this training.

Exercise 1.3

Explore the following quick tricks that the console in RStudio offers. After the prompt:

  • press the ctrl button on Windows, or the command on Mac together with the arrow key . What do you see?

It should give you a list of all previously executed commands. You can then use the and arrow keys to directly move to a certain command to repeat or correct it.

  • start to write sq and then hit the tabulator (tab) key right after without an additional space. What do you see?

It should present you with a list of suggested commands together with a short description. You can then use the and arrow keys to navigate through them. Which one is the relevant one to compute the square root of a number? Choose the appropriate function, then hit enter to compute the square root of 100.

Need for R Scripts

Suppose you want to continue your work tomorrow. If you would now close RStudio, all of your work would be gone!

To avoid this problem, we will not give commands directly in the R console, but save them in an R Script.

R Script

What is an R Script?

While many simple calculations can be done using the command line, as soon as things get more complicated, scripts should be used.

An R Script

  • is a plain text file that contains a collection of R commands

  • is written in such a form to perform commands in a step by step fashion

  • contains all commands including those for importing data, analyzing data, visualizing data or other tasks

  • can be saved as a .R file

  • allows you to automate and reproduce your work, allowing you to run the same analysis without having to redo each step manually again

  • makes your work organized, shareable and transparent!

Creating an R Script

Creating an R Script is easy.

Using the menu bar, go to File -> New File -> R Script:

Alternatively, you could have used the shortcut ctrl+shift+N on Windows or command+shift+N on Mac.

Structure in RStudio

RStudio now displays 4 windows, with the R Script currently displayed in the top left:

R Script

Currently the R Script is Untitled, we should give it a name.

Exercise 1.4

Give your R Script a name and save it. Go to File -> Save As…:

R Working Directory

Before we start working in our R Script, let us first set the working directory in R.

The working directory is the location on your computer where R will read and save files. Go to Session -> Set Working Directory. Two convenient options are:

  • Choose Directory…: Choose the directory yourself

  • To Source File Location: Set the working directory to the directory where your R Script (the source file) is saved

R Working Directory

Select the option to set the working directory to the source file location. You will see that a command automatically pops up in the R console, something like:

setwd("C:/Rtraining")

You can use this command directly next time. Mind the usage of the / when specifying the path of your working directory!

You can also include comment lines in your R script. These start with the symbol # and allow you to document your code. For instance:

# Setting my working directory
setwd("C:/Rtraining")

Exercise 1.5

Copy some of the commands you previously executed into your R Script.

Execute a specific line

  • by pressing the Run button to execute the line on which your cursor is currently located

  • by using ctrl+enter on Windows or command+enter on Mac to execute the line on which your cursor is currently located

You can execute multiple lines by highlighting them all and doing one of the above. Try this.

Exercise 1.6

You can adjust the sizes and positioning of the windows (panes) according to your liking.

Go to Tools, Global Options… and then select Pane Layout from which you can adjust the positioning of each of the 4 panes by selecting them from the drop-down menus

Experiment with a different positioning and re-adjust according to your liking.

R Package Management System

R Package Management System

Until now, we have explored some basic functionality in R. But much of the functionality of R is extended by its big and active community.

These extensions are called R packages

What are R packages?

R packages

  • are constantly developed and adjusted by the large user community making many state-of-the-art methods quickly available

  • can be installed for just about everything you want to do in statistics and data science

  • offer pre-defined functions which makes it possible to use R without a deep understanding of the language and programming skills.

Overview of R packages

The standard distribution of R comes already with a number of packages.

A list of the currently installed packages can be obtained from the Packages window in the bottom right window below

You can install new packages depending on your needs. We will explore R packages in the upcoming sessions!