(Optional) Part V: Working with Dates and Times (20 minutes)
Handling date and time data in R
R provides the Date class for dates and the POSIXct and POSIXlt classes for times.
# Converting a string to a Date object
date_example <- as.Date("2021-01-01")
print(date_example)
#> [1] "2021-01-01"
# Converting a string to a POSIXct datetime object
datetime_example <- as.POSIXct("2021-01-01 10:00:00", tz = "GMT")
print(datetime_example)
#> [1] "2021-01-01 10:00:00 GMT"
Common date and time functions
# Extracting parts of a date
year <- format(date_example, "%Y")
month <- format(date_example, "%m")
day <- format(date_example, "%d")
print(paste("Year:", year, "- Month:", month, "- Day:", day))
#> [1] "Year: 2021 - Month: 01 - Day: 01"
# Working with time intervals
start_time <- as.POSIXct("2021-01-01 08:00:00", tz = "GMT")
end_time <- as.POSIXct("2021-01-01 10:00:00", tz = "GMT")
time_diff <- difftime(end_time, start_time, units = "hours")
print(paste("Difference in hours:", time_diff))
#> [1] "Difference in hours: 2"
# Loading a dataset with date and time data for exercises
# Using the 'airquality' dataset from the 'datasets' package
data(airquality)
airquality$Date <- as.Date(with(airquality, paste(1973, Month, Day, sep = "-")))
print(head(airquality))
#> Ozone Solar.R Wind Temp Month Day Date
#> 1 41 190 7.4 67 5 1 1973-05-01
#> 2 36 118 8.0 72 5 2 1973-05-02
#> 3 12 149 12.6 74 5 3 1973-05-03
#> 4 18 313 11.5 62 5 4 1973-05-04
#> 5 NA NA 14.3 56 5 5 1973-05-05
#> 6 28 NA 14.9 66 5 6 1973-05-06
Exercises:
data(airquality)
airquality$Date <- as.Date(with(airquality, paste(1973, Month, Day, sep = "-")))
print(head(airquality))
#> Ozone Solar.R Wind Temp Month Day Date
#> 1 41 190 7.4 67 5 1 1973-05-01
#> 2 36 118 8.0 72 5 2 1973-05-02
#> 3 12 149 12.6 74 5 3 1973-05-03
#> 4 18 313 11.5 62 5 4 1973-05-04
#> 5 NA NA 14.3 56 5 5 1973-05-05
#> 6 28 NA 14.9 66 5 6 1973-05-06
- Convert the ‘Date’ column in the ‘airquality’ dataset to a week day and create a new column ‘WeekDay’.
airquality$WeekDay <- weekdays(airquality$Date)
print(head(airquality))
#> Ozone Solar.R Wind Temp Month Day Date WeekDay
#> 1 41 190 7.4 67 5 1 1973-05-01 Tuesday
#> 2 36 118 8.0 72 5 2 1973-05-02 Wednesday
#> 3 12 149 12.6 74 5 3 1973-05-03 Thursday
#> 4 18 313 11.5 62 5 4 1973-05-04 Friday
#> 5 NA NA 14.3 56 5 5 1973-05-05 Saturday
#> 6 28 NA 14.9 66 5 6 1973-05-06 Sunday
- Calculate the number of days between the first and last measurements in the ‘airquality’ dataset.
0.6 Create and format dates
To create a Date object from a simple character string in R, you can use the as.Date() function. The character string has to obey a format that can be defined using a set of symbols (the examples correspond to 13 January, 1982):
%Y
: 4-digit year (1982)
%y
: 2-digit year (82)
%m
: 2-digit month (01)
%d
: 2-digit day of the month (13)
%A
: weekday (Wednesday)
%a
: abbreviated weekday (Wed)
%B
: month (January)
%b
: abbreviated month (Jan)
For more information and a full list use ?strptime
0.7 as.Date()
as.Date('2019-06-05',format = '%Y-%m-%d')
#> [1] "2019-06-05"
Dates are often stored as integers.
Convert integers to dates by speciying the origin (Day 0).
For example: SAS stores dates at the number of days elapsed since 1 Jan 1960.
as.Date(21705, origin = '1960-01-01')
#> [1] "2019-06-05"
Exercise 1
Work with the policy_data data set. 1. Convert the start date (debut_pol) and end date (fin_pol) into R Date objects.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
policy_data <- read.csv(file = 'John Jay Workshop Data/PolicyData.csv', sep = ';')
policy_data$start <- as.Date(policy_data$debut_pol, '%d/%m/%Y')
policy_data$end <- as.Date(policy_data$fin_pol, '%d/%m/%Y')
head(policy_data %>% select(c('debut_pol', 'start')))
#> debut_pol start
#> 1 14/09/1995 1995-09-14
#> 2 25/04/1996 1996-04-25
#> 3 1/03/1995 1995-03-01
#> 4 1/03/1996 1996-03-01
#> 5 15/01/1997 1997-01-15
#> 6 1/02/1997 1997-02-01
format()
today <- as.Date('2019-06-05',
format = '%Y-%m-%d')
format(today, '%A %d %B %Y')
#> [1] "Wednesday 05 June 2019"
Calculate the duration of a contract.
policy_duration =
policy_data$end - policy_data$start
You can add and subtract integers from dates.
tomorrow = today + 1
print(tomorrow)
#> [1] "2019-06-06"
0.8 Access date components
# install.packages("lubridate")
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
year(today)
#> [1] 2019
Other components are: month(), day(), quarter(), …
0.9 Advanced math
today + months(3)
#> [1] "2019-09-05"
Other periods are: years() and days().
floor_date(today, unit = "month")
#> [1] "2019-06-01"
floor_date rounds down to the nearest unit.
In the example convert daily into monthly data.