How to combine date, hour, and minute columns into a datetime column in R


I have columns laid out in the following way:

date, hour, minute

The first column holds year, month, and day together. I want to combine columns 1, 4, and 5 into a DateTime format in R. Please help me. I have tried as.date but I am getting an error.


There are 2 best solutions below

Best answer

For the sake of completeness, here is a different approach: sprintf() builds a character vector of zero-padded datetimes, which as.POSIXct() then converts. with(df1, ...) is used to avoid typing df1$dates, df1$hour, and df1$min.

with(df1, as.POSIXct(sprintf("%s%02i%02i", dates, hour, min), tz = "UTC", format = "%Y%m%d%H%M"))
 [1] "2001-01-01 01:00:00 UTC" "2001-01-01 01:30:00 UTC" "2001-01-01 02:00:00 UTC"
 [4] "2001-01-01 02:30:00 UTC" "2001-01-01 03:00:00 UTC" "2001-01-01 03:30:00 UTC"
 [7] "2001-01-01 04:00:00 UTC" "2001-01-01 04:30:00 UTC" "2001-01-01 05:00:00 UTC"
[10] "2001-01-01 05:30:00 UTC" "2001-01-01 06:00:00 UTC" "2001-01-01 06:30:00 UTC"
[13] "2001-01-01 07:00:00 UTC" "2001-01-01 07:30:00 UTC"

Also note that the time zone has been set explicitly.
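To make the two steps easier to follow, here is a minimal sketch that splits the one-liner into the sprintf() step and the parsing step. It uses a small stand-in data frame (an assumption, shortened from the df1 in the Data section), and the names stamp and dt are chosen only for illustration.

```r
# Small stand-in for df1 (assumed layout: character dates, integer hour/min).
df1 <- data.frame(dates = rep("20010101", 4L),
                  hour  = rep(1:2, each = 2L),
                  min   = rep(c(0L, 30L), 2L),
                  stringsAsFactors = FALSE)

# Step 1: build zero-padded character strings of the form yyyymmddHHMM.
stamp <- with(df1, sprintf("%s%02i%02i", dates, hour, min))
stamp
# "200101010100" "200101010130" "200101010200" "200101010230"

# Step 2: parse the strings; the format must mirror the sprintf() layout.
dt <- as.POSIXct(stamp, format = "%Y%m%d%H%M", tz = "UTC")

# The time zone is stored as an attribute of the result.
attr(dt, "tzone")
```

If tz is omitted, the strings are interpreted in the session's local time zone, so the same input can denote different instants on different machines.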

Another alternative is to use the anytime package:

with(df1, anytime::utctime(sprintf("%s %02i%02i", dates, hour, min), tz = "UTC"))
 [1] "2001-01-01 01:00:00 UTC" "2001-01-01 01:30:00 UTC" "2001-01-01 02:00:00 UTC"
 [4] "2001-01-01 02:30:00 UTC" "2001-01-01 03:00:00 UTC" "2001-01-01 03:30:00 UTC"
 [7] "2001-01-01 04:00:00 UTC" "2001-01-01 04:30:00 UTC" "2001-01-01 05:00:00 UTC"
[10] "2001-01-01 05:30:00 UTC" "2001-01-01 06:00:00 UTC" "2001-01-01 06:30:00 UTC"
[13] "2001-01-01 07:00:00 UTC" "2001-01-01 07:30:00 UTC"

Note that no format specifiers are required. Only a blank needs to be inserted between dates and hour.

And finally, the lubridate package can be used:

with(df1, lubridate::ymd_hm(sprintf("%s%02i%02i", dates, hour, min)))
 [1] "2001-01-01 01:00:00 UTC" "2001-01-01 01:30:00 UTC" "2001-01-01 02:00:00 UTC"
 [4] "2001-01-01 02:30:00 UTC" "2001-01-01 03:00:00 UTC" "2001-01-01 03:30:00 UTC"
 [7] "2001-01-01 04:00:00 UTC" "2001-01-01 04:30:00 UTC" "2001-01-01 05:00:00 UTC"
[10] "2001-01-01 05:30:00 UTC" "2001-01-01 06:00:00 UTC" "2001-01-01 06:30:00 UTC"
[13] "2001-01-01 07:00:00 UTC" "2001-01-01 07:30:00 UTC"
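Each of the calls above returns a vector rather than modifying df1. Since the question asks for a datetime column, a minimal sketch of storing the result could look like the following. It uses the base R variant, and the column name datetime is a hypothetical choice, not something given in the question.

```r
# Same layout as the sample data used in this answer.
df1 <- data.frame(dates = rep("20010101", 14L),
                  times = NA, d = NA,
                  hour = rep(1:7, each = 2L),
                  min = rep(c(0L, 30L), 7L),
                  stringsAsFactors = FALSE)

# 'datetime' is a hypothetical column name chosen for this sketch.
df1$datetime <- with(df1, as.POSIXct(sprintf("%s%02i%02i", dates, hour, min),
                                     format = "%Y%m%d%H%M", tz = "UTC"))

# Inspect the class and first values of the new column.
str(df1$datetime)
```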

Data

df1 <- data.frame(dates = rep("20010101", 14L), 
                  times = NA, d = NA, 
                  hour = rep(1:7, each = 2L), 
                  min = rep(c(0L, 30L), 7L))

Second answer

The following function combines three columns into one object of class "POSIXct":

  1. A column combining year, month and day in the format yyyymmdd;
  2. An hour column;
  3. A minutes column.

The seconds are set to zero.

toDateTime <- function(x){
  d <- as.Date(x[[1]], format = "%Y%m%d")
  ISOdatetime(format(d, "%Y"), format(d, "%m"), format(d, "%d"), 
              hour = x[[2]], min = x[[3]], sec = 0L)
}

toDateTime(df1[c(1, 4, 5)])
# [1] "2001-01-01 01:00:00 WET" "2001-01-01 01:30:00 WET"
# [3] "2001-01-01 02:00:00 WET" "2001-01-01 02:30:00 WET"
# [5] "2001-01-01 03:00:00 WET" "2001-01-01 03:30:00 WET"
# [7] "2001-01-01 04:00:00 WET" "2001-01-01 04:30:00 WET"
# [9] "2001-01-01 05:00:00 WET" "2001-01-01 05:30:00 WET"
#[11] "2001-01-01 06:00:00 WET" "2001-01-01 06:30:00 WET"
#[13] "2001-01-01 07:00:00 WET" "2001-01-01 07:30:00 WET"
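The output above is labelled WET because ISOdatetime() defaults to the session's local time zone. If a fixed time zone is wanted, one possible variant passes a tz argument through. This is a sketch only: the function name toDateTimeTz and its "UTC" default are assumptions made for illustration.

```r
# Variant of toDateTime() with an explicit time zone (hypothetical name).
toDateTimeTz <- function(x, tz = "UTC") {
  # First column: yyyymmdd strings parsed into Date objects.
  d <- as.Date(x[[1]], format = "%Y%m%d")
  # Second and third columns: hour and minute; seconds are fixed at zero.
  ISOdatetime(format(d, "%Y"), format(d, "%m"), format(d, "%d"),
              hour = x[[2]], min = x[[3]], sec = 0L, tz = tz)
}

# Small example data frame with the same column layout as df1.
df1 <- data.frame(dates = rep("20010101", 4L), times = NA, d = NA,
                  hour = rep(1:2, each = 2L),
                  min = rep(c(0L, 30L), 2L),
                  stringsAsFactors = FALSE)

res <- toDateTimeTz(df1[c(1, 4, 5)])
attr(res, "tzone")
```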

Test data

In the test data that follows, I have set columns 2 and 3 to NA, since they are not used.

dates <- "20010101"
hour <- rep(1:7, each = 2)
min <- rep(c(0, 30), length.out = 14)
df1 <- data.frame(dates, times = NA, d = NA, hour, min)