I'm trying to scrape this website one day at a time: https://www.basketball-reference.com/boxscores/?month=9&day=30&year=2020 As you can see, the date in the URL is not in a standard format, and I can't figure out how to get R to recognize it as a date.

Here is my code so far:

url <- "https://www.basketball-reference.com/boxscores/"

timevalues <- seq(as.Date("month=10&day=2&year=2020"), as.Date("month=10&day=2&year=2020"), by = "day")
head(timevalues)

Error in charToDate(x) : character string is not in a standard unambiguous format

There are 2 best solutions below

We can use glue, together with month(), day(), and year() from lubridate, to create one URL per date:

library(lubridate)
url <- "https://www.basketball-reference.com/boxscores/"
my_dates <- seq(as.Date("2020-09-25"), as.Date("2020-10-05"), by = "day")
urls <- glue::glue("{url}?month={month(my_dates)}&day={day(my_dates)}",
               "&year={year(my_dates)}")

Printing urls gives:

#https://www.basketball-reference.com/boxscores/?month=9&day=25&year=2020
#https://www.basketball-reference.com/boxscores/?month=9&day=26&year=2020
#https://www.basketball-reference.com/boxscores/?month=9&day=27&year=2020
#https://www.basketball-reference.com/boxscores/?month=9&day=28&year=2020
#https://www.basketball-reference.com/boxscores/?month=9&day=29&year=2020
#https://www.basketball-reference.com/boxscores/?month=9&day=30&year=2020
#https://www.basketball-reference.com/boxscores/?month=10&day=1&year=2020
#https://www.basketball-reference.com/boxscores/?month=10&day=2&year=2020
#https://www.basketball-reference.com/boxscores/?month=10&day=3&year=2020
#https://www.basketball-reference.com/boxscores/?month=10&day=4&year=2020
#https://www.basketball-reference.com/boxscores/?month=10&day=5&year=2020
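
If installing glue and lubridate is not an option, the same URLs can probably be built in base R alone. This is only a minimal sketch and not part of the answer above: build_urls is a hypothetical helper name, and it assumes the site wants month and day without leading zeros, as in the URL from the question.

```r
# Hypothetical helper: build one box-score URL per date using base R only.
build_urls <- function(dates,
                       base = "https://www.basketball-reference.com/boxscores/") {
  # format() returns zero-padded strings such as "09";
  # as.integer() drops the padding so the URL gets month=9, not month=09.
  m <- as.integer(format(dates, "%m"))
  d <- as.integer(format(dates, "%d"))
  y <- as.integer(format(dates, "%Y"))
  # sprintf() is vectorised, so this returns one URL per element of dates.
  sprintf("%s?month=%d&day=%d&year=%d", base, m, d, y)
}

my_dates <- seq(as.Date("2020-09-25"), as.Date("2020-10-05"), by = "day")
urls <- build_urls(my_dates)
```

The result is a plain character vector, so it can be passed to whatever scraping loop you already have.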

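as.Date() can't parse a string like "month=10&day=2&year=2020": without a format argument it only accepts standard forms such as "2020-10-02" or "2020/10/02", which is why you get the "not in a standard unambiguous format" error. Instead, build the sequence from standard dates and then extract the month, day, and year with the lubridate package: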

library(lubridate)

url <- "https://www.basketball-reference.com/boxscores/"

my_dates <- seq(as.Date("2020-09-25"), as.Date("2020-10-05"), by = "day")

urls <- paste0(url, 
               "?month=", month(my_dates),
               "&day=", day(my_dates), 
               "&year=", year(my_dates))

urls
#>  [1] "https://www.basketball-reference.com/boxscores/?month=9&day=25&year=2020"
#>  [2] "https://www.basketball-reference.com/boxscores/?month=9&day=26&year=2020"
#>  [3] "https://www.basketball-reference.com/boxscores/?month=9&day=27&year=2020"
#>  [4] "https://www.basketball-reference.com/boxscores/?month=9&day=28&year=2020"
#>  [5] "https://www.basketball-reference.com/boxscores/?month=9&day=29&year=2020"
#>  [6] "https://www.basketball-reference.com/boxscores/?month=9&day=30&year=2020"
#>  [7] "https://www.basketball-reference.com/boxscores/?month=10&day=1&year=2020"
#>  [8] "https://www.basketball-reference.com/boxscores/?month=10&day=2&year=2020"
#>  [9] "https://www.basketball-reference.com/boxscores/?month=10&day=3&year=2020"
#> [10] "https://www.basketball-reference.com/boxscores/?month=10&day=4&year=2020"
#> [11] "https://www.basketball-reference.com/boxscores/?month=10&day=5&year=2020"
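
If you would rather keep the query-style strings from the question as input, as.Date() can be told how to read them through its format argument. The following is a sketch under the assumption that every string follows exactly the "month=..&day=..&year=...." pattern; query_format is just an illustrative variable name.

```r
# Literal text in the format string must match the input exactly;
# %m, %d and %Y mark where the month, day and year appear.
query_format <- "month=%m&day=%d&year=%Y"

start <- as.Date("month=9&day=25&year=2020", format = query_format)
end   <- as.Date("month=10&day=5&year=2020", format = query_format)

# With proper Date objects, seq() works as intended in the question.
timevalues <- seq(start, end, by = "day")
head(timevalues)
```

A string that does not match the pattern comes back as NA rather than raising an error, so it may be worth checking the result with anyNA() before building URLs from it.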