Session 4
getwd()
Navigating through the menus in RStudio is easy (click and go), but the menu steps have to be repeated every time the code is run.
Go to Session -> Set Working Directory. Two convenient options are:
Choose Directory…: Choose the directory yourself
To Source File Location: Set the working directory to the directory where your R Script (the source file) is saved
setwd("~/R_training")
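The two functions above can be tried out safely with a short sketch like the one below. The target directory is a hypothetical stand-in (R's temporary folder, which always exists); in practice it would be the project folder.

```r
# Remember the current working directory so it can be restored afterwards
old_wd <- getwd()

# Hypothetical target directory: R's temporary folder, used only for illustration
target <- tempdir()
setwd(target)

# normalizePath() resolves symbolic links so the two paths can be compared
same_dir <- normalizePath(getwd()) == normalizePath(target)
print(same_dir)

# Restore the original working directory
setwd(old_wd)
```

Putting a setwd() call at the top of a script avoids repeating the menu steps, at the cost of hard-coding a path that may not exist on another computer.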
R interacts with files in several ways.
Datasets can come in different formats.
        NAME        MONTH  TEMP
99091   EINDHOVEN       1  10.6
99122   EINDHOVEN       2   7.1
99151   EINDHOVEN       3  10.2
99178   EINDHOVEN       4   8.9
99207   EINDHOVEN       5  18.5
99238   EINDHOVEN       6  15.0
99268   EINDHOVEN       7  15.0
99299   EINDHOVEN       8  20.7
99329   EINDHOVEN       9  22.6
99359   EINDHOVEN      10  10.0
99390   EINDHOVEN      11  10.5
99415   EINDHOVEN      12   9.7
99801   MAASTRICHT      1   9.7
99832   MAASTRICHT      2   5.9
99861   MAASTRICHT      3   9.9
99888   MAASTRICHT      4   9.0
99917   MAASTRICHT      5  15.7
99948   MAASTRICHT      6  14.1
99978   MAASTRICHT      7  14.9
100009  MAASTRICHT      8  20.5
100039  MAASTRICHT      9  21.6
100069  MAASTRICHT     10  10.3
100100  MAASTRICHT     11  10.2
100125  MAASTRICHT     12   9.2
Option 1: Using menus within RStudio is the easiest way (click and go), but the menu steps have to be repeated every time the code is run.
Option 1: Using menus within RStudio (cont’d)
Advice for option 1:
Advice for option 1 (cont’d):
library(readxl)
climate <- read_excel("data/climate_wide.xlsx")
rio::import("data/climate.csv") # import() is assumed here to come from the rio package
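As a point of comparison, a CSV file can also be read with base R's read.csv(). The sketch below is self-contained: it first writes a small hypothetical file to a temporary location, using three of the Eindhoven rows shown earlier, and then reads it back. The object names csv_path, example and climate_csv are made up for the illustration.

```r
# Hypothetical example file in a temporary location
csv_path <- tempfile(fileext = ".csv")

# Three rows taken from the Eindhoven data shown earlier
example <- data.frame(
  NAME  = c("EINDHOVEN", "EINDHOVEN", "EINDHOVEN"),
  MONTH = 1:3,
  TEMP  = c(10.6, 7.1, 10.2)
)

# Write the data without row names, then read it back with base R
write.csv(example, csv_path, row.names = FALSE)
climate_csv <- read.csv(csv_path)

# Inspect column names and types
str(climate_csv)
```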
library(quantmod)
# Get Apple Inc. (AAPL) stock data from Yahoo Finance
getSymbols("AAPL", src = "yahoo", from = "2024-01-01", to = "2025-06-13")
# Save data as RData file
save.image("data/AAPL.RData")
Loading required package: xts
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Loading required package: TTR
Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
# View the first few rows
head(AAPL)
AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
2024-01-02 187.15 188.44 183.89 185.64 82488700 184.2904
2024-01-03 184.22 185.88 183.43 184.25 58414500 182.9105
2024-01-04 182.15 183.09 180.88 181.91 71983600 180.5876
2024-01-05 181.99 182.76 180.17 181.18 62303300 179.8628
2024-01-08 182.09 185.60 181.50 185.56 59144500 184.2110
2024-01-09 183.92 185.15 182.73 185.14 42841800 183.7941
save() saves a selection of objects as an .RData file.
save.image() saves all objects in the workspace as an .RData file.
… long_data, and save this list using the function save().
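A minimal sketch of saving a selection of objects with save() and reading them back with load(). The objects x, y and z and the temporary file are hypothetical; loading into a separate environment is only done here to make visible which objects were stored.

```r
# Three hypothetical objects in the workspace
x <- 1:5
y <- c("a", "b")
z <- 3.14

# Save only x and y to a temporary .RData file
rdata_path <- tempfile(fileext = ".RData")
save(x, y, file = rdata_path)

# Load into a fresh environment; load() returns the names of the restored objects
restored <- new.env()
loaded_names <- load(rdata_path, envir = restored)
print(loaded_names)
```

Because z was not passed to save(), it is not part of the file; save.image() would have stored all three objects.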
R’s standard graphics package offers high-level and low-level plotting functions: high-level functions create a new plot, low-level functions add elements to an existing one.
plot(long_data$TEMP) ## Plotting a single variable
plot(x = long_data$MONTH, y = long_data$TEMP) ## Scatter plot wrt month
Notice that plot() is a generic function that calls methods: it performs different operations depending on the class of the object passed to it. (We study the lm() function in detail in the next session!)
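The dispatch on class can be illustrated with a small sketch. The data below are made up, and pdf(NULL) is used only so that the example runs without opening a window or writing a file; in an interactive session that line and the dev.off() call can be dropped.

```r
# Open a null PDF device so the plots are drawn without creating a file
pdf(NULL)

# Made-up data for illustration
x <- 1:10
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9)
fit <- lm(y ~ x)

print(class(x))    # an integer vector
print(class(fit))  # an object of class "lm"

plot(x, y)            # handled by the default method: a scatter plot
plot(fit, which = 1)  # handled by the method for "lm" objects: a diagnostic plot

# Check that a plot method is registered for class "lm"
has_lm_method <- !is.null(getS3method("plot", "lm", optional = TRUE))
print(has_lm_method)

invisible(dev.off())
```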
AAPL$AAPL.Close
The plot() function takes many arguments that can change the layout of the plots. See ?par for all graphical options; there are many!
Some examples:
col: color of lines / points
lty, lwd: line type and thickness
pch: point type (0-25)
main, sub: title, subtitle
xlab, ylab: x and y axis labels
log, xlog and ylog: logarithmic scales
xlim, ylim: x and y axis limits (for overriding R’s default choices)
mfcol, mfrow: multiple plots in one graphics window (column-wise/row-wise)
Low-level functions add elements to an existing plot:
lines: draw lines
abline: quickly add horizontal and vertical lines, and lines using the equation \(y = bx + a\)
points: add points
arrows: add arrows
title: add a title
legend: add a legend
text: add text at \((x,y)\) coordinates
mtext: add text with positional specification like side=1,...,4
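A short sketch combining some of the arguments and low-level functions listed above. It uses the twelve Eindhoven values from the table earlier in the session; the colours, symbols and labels are arbitrary choices, and pdf(NULL) is only there so that the example runs without opening a window.

```r
# Null device: draw without creating a file (drop this line in an interactive session)
pdf(NULL)

# Eindhoven temperatures from the table shown earlier
month <- 1:12
temp  <- c(10.6, 7.1, 10.2, 8.9, 18.5, 15.0, 15.0, 20.7, 22.6, 10.0, 10.5, 9.7)

# High-level call: creates the plot and sets colour, line type, symbols, labels and limits
plot(month, temp, type = "b", col = "darkgreen", lty = 2, lwd = 2, pch = 16,
     main = "Eindhoven", xlab = "Month", ylab = "Temperature", ylim = c(0, 25))

# Low-level calls: add elements to the existing plot
warmest <- which.max(temp)
points(month[warmest], temp[warmest], col = "red", pch = 17, cex = 1.5)
text(month[warmest], temp[warmest], labels = "warmest", pos = 1)
legend("topleft", legend = c("monthly value", "warmest month"),
       col = c("darkgreen", "red"), pch = c(16, 17))

invisible(dev.off())
```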
We want to visualize the daily temperatures in the climate data specifically for Maastricht. First, make a basic plot of temperatures in Maastricht then customise the plot in the following ways:
The title of the X-axis should say ‘Month’, the title of the Y-axis ‘Average Temperature’.
Make the plot a line plot with a blue line. (Hint: specifying the colour literally as "blue" works.)
Make the tick marks appear on the inside of the figure rather than the outside.
Calculate the average temperature.
Add a horizontal line with the average maximum temperature.
You will need to consult the help file for this exercise, so treat it more as an exercise in navigating R’s help system than as an exercise in plotting (which we will cover in more detail later).
You may want to ask ChatGPT for help.
You can manually save graphs in several formats.
Best practice is to save a graph through a device such as pdf or similar:
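A minimal sketch of the device approach, assuming a PDF is wanted. The file name and the plotted values are hypothetical; the pattern is to open the device, draw, and close the device with dev.off() so that the file is written.

```r
# Hypothetical output file in R's temporary folder
plot_path <- file.path(tempdir(), "example_plot.pdf")

# Open the PDF device (width and height are in inches)
pdf(plot_path, width = 6, height = 4)

# Everything drawn now goes into the file instead of the screen
plot(1:10, (1:10)^2, type = "l", xlab = "x", ylab = "x squared")

# Closing the device finishes writing the file
invisible(dev.off())

print(file.exists(plot_path))
```

Other devices such as png() or jpeg() follow the same open, draw, close pattern.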
A for loop that iterates over time points (months) can be used to plot wide_data$MAASTRICHT gradually.
# Start with a plot of the first 3 observations
plot(1:3, wide_data$MAASTRICHT[1:3],
     xlab = "Time", ylab = "Value", main = "Adding Data Over Time", type = 'l')
# Gradually plot more and more of the data using a `for loop`
for (i in 4:nrow(wide_data)) {
  plot(1:i, wide_data$MAASTRICHT[1:i], # notice index i is increasing the number of plotted points
       xlab = "Time", ylab = "Value", main = "Adding Data Over Time", type = 'l')
}
Exercise 4.4 Make a continuous plot of temperatures
- Use the last loop example to plot temperatures gradually.
- Start with an initial number of 3 observations, as in the example.
- Make sure that the ranges of the x and y axes in the first plot match the whole dataset.
- Within the for loop, add lines to the first plot instead of plotting the data again.
- Pause the program within the for loop to simulate a “gradual” effect.
- You can use ChatGPT or the help pages ?ylim, ?Sys.sleep.
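One possible way to combine these ingredients is sketched below. It is not the only solution: it uses the twelve Maastricht values from the table earlier in the session instead of wide_data, a short pause of 0.05 seconds, and pdf(NULL) so that it runs without a window. On an interactive device the line would appear to grow step by step.

```r
# Null device so the sketch runs anywhere; drop this line to watch the effect on screen
pdf(NULL)

# Maastricht temperatures from the table shown earlier
values <- c(9.7, 5.9, 9.9, 9.0, 15.7, 14.1, 14.9, 20.5, 21.6, 10.3, 10.2, 9.2)
n <- length(values)

# First plot: only 3 observations, but axis ranges fixed to cover the whole series
plot(1:3, values[1:3], type = "l",
     xlim = c(1, n), ylim = range(values),
     xlab = "Time", ylab = "Value", main = "Adding Data Over Time")

# Add one line segment per iteration instead of redrawing the plot
segments_drawn <- 0
for (i in 4:n) {
  lines((i - 1):i, values[(i - 1):i])
  Sys.sleep(0.05)  # short pause to simulate the gradual effect
  segments_drawn <- segments_drawn + 1
}

invisible(dev.off())
```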
Other plotting packages: ggplot2 (see the book by Hadley Wickham), plotly, Rgnuplot, …
?ggplot2::geom_line
?ggplot # a bit complicated help file
help(package = "ggplot2") # a nicer list of all layer functions, see 'geom_line'
Use + to add layers.
wide_data is a data frame with an index.
library('ggplot2')
load("data/climate_wide.Rdata") # load data
wide_data$index <- 1:nrow(wide_data) # add an index column
ggplot(wide_data, aes(x = MAASTRICHT)) +
  geom_histogram(binwidth = 2) # the parameter is `binwidth`, not `bandwidth`; 2 is an illustrative value
geom_line (line plot)
Below, color is defined in aes, unlike on the earlier slide.
See ?ggplot2::geom_line.
load("data/climate_wide.Rdata") # load data
wide_data$index <- 1:nrow(wide_data) # add an index column
ggplot(wide_data, aes(x = index, y = MAASTRICHT)) +
  geom_line(color = "blue", linewidth = 2) + # `linewidth` replaces the deprecated `size` for lines
  geom_line(aes(x = index, y = EINDHOVEN), linewidth = 0.3) # adds Eindhoven data to the same plot
load("data/climate_wide.Rdata") # load data
wide_data$index <- 1:nrow(wide_data) # add an index column
ggplot(wide_data, aes(x = index)) +
  geom_line(aes(y = MAASTRICHT, color = "Maastricht"), linewidth = 2) + # color mapped inside aes() so a legend is drawn
  geom_line(aes(y = EINDHOVEN, color = "Eindhoven"), linewidth = 0.3) + # adds Eindhoven data to the same plot
  labs(
    x = "X Axis", y = "Y Axis", color = "Legend Title", # axis labels and legend title
    title = "Line Plot with Two Variables"
  ) +
  scale_color_manual(values = c(Maastricht = "blue", Eindhoven = "red")) + # custom colors, matched by name
  theme_minimal() # minimal theme for a clean look
geom_point, geom_hline, labs.
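A small sketch of how these three functions can be combined, assuming data in long format with the columns NAME, MONTH and TEMP as in the table earlier in the session. The data frame long_example is hypothetical and contains only the first three months of each city; the labels are arbitrary.

```r
library(ggplot2)

# Hypothetical long-format data: first three months of each city from the earlier table
long_example <- data.frame(
  NAME  = rep(c("EINDHOVEN", "MAASTRICHT"), each = 3),
  MONTH = rep(1:3, times = 2),
  TEMP  = c(10.6, 7.1, 10.2, 9.7, 5.9, 9.9)
)

# Overall average, used for the horizontal reference line
mean_temp <- mean(long_example$TEMP)

p <- ggplot(long_example, aes(x = MONTH, y = TEMP, color = NAME)) +
  geom_line() +                                              # one line per city
  geom_point() +                                             # mark the observations
  geom_hline(yintercept = mean_temp, linetype = "dashed") +  # average as a reference line
  labs(x = "Month", y = "Temperature", color = "City",
       title = "Temperatures by city")

# print(p)  # draws the plot in an interactive session
```

With long-format data a single geom_line() layer is enough, because the color mapping in aes() splits the data into one line per city and creates the legend automatically.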