Incorporating time into shortest path calculations in R {igraph} or similar packages

41 Views Asked by At

I'm working with National Rail open data to try and find the shortest path between stations across the UK at particular times. The data I'm using to build my network looks like this:

services <- 
structure(list(
  from = c("CARLILE", "WETHERL", "BRMPTNC", "HLTWHST", 
"HYDB", "HEXHAM", "PRUDHOE", "GTSHDMC", "NWCSTLE", "MANORS", 
"CRMLNGT", "HBOLTN", "KKBY", "FAZKRLY", "RICELA", "KRKDALE", 
"SANDH", "MORFNL", "HBOLTN", "KKBY", "FAZKRLY", "RICELA", "KRKDALE", 
"SANDH", "MORFNL", "HBOLTN", "KKBY", "FAZKRLY", "RICELA", "KRKDALE", 
"SANDH", "MORFNL", "LVRPLCH", "MORFNL", "SANDH", "KRKDALE", "RICELA", 
"FAZKRLY", "KKBY", "MDNHEAD", "FURZEP", "COOKHAM", "MARLOW", 
"BORNEND", "MARLOW", "BORNEND", "COOKHAM", "FURZEP", "DARTFD", 
"BRNHRST"),
  to = c("WETHERL", "BRMPTNC", "HLTWHST", "HYDB", "HEXHAM", 
"PRUDHOE", "GTSHDMC", "NWCSTLE", "MANORS", "CRMLNGT", "MRPTHRP", 
"KKBY", "FAZKRLY", "RICELA", "KRKDALE", "SANDH", "MORFNL", "LVRPLCH", 
"KKBY", "FAZKRLY", "RICELA", "KRKDALE", "SANDH", "MORFNL", "LVRPLCH", 
"KKBY", "FAZKRLY", "RICELA", "KRKDALE", "SANDH", "MORFNL", "LVRPLCH", 
"MORFNL", "SANDH", "KRKDALE", "RICELA", "FAZKRLY", "KKBY", "HBOLTN", 
"FURZEP", "COOKHAM", "BORNEND", "BORNEND", "MARLOW", "BORNEND", 
"COOKHAM", "FURZEP", "MDNHEAD", "BRNHRST", "BXLYHTH"),
  duration = c(360, 
540, 840, 600, 600, 660, 660, 480, 60, 600, 480, 60, 180, 120, 
120, 120, 180, 120, 60, 180, 120, 120, 120, 180, 120, 60, 180, 
120, 120, 120, 180, 120, 120, 240, 120, 60, 120, 180, 120, 180, 
180, 240, 420, 420, 420, 180, 180, 240, 420, 120),
  `Scheduled Departure Time` = structure(c(1709213940, 
1709214540, 1709215440, 1709216040, 1709216700, 1709217420, 1709218140, 
1709218740, 1709218860, 1709219460, NA, 1709188080, 1709188260, 
1709188440, 1709188620, 1709188800, 1709189040, NA, 1709189880, 
1709190060, 1709190240, 1709190420, 1709190600, 1709190840, NA, 
1709249580, 1709249760, 1709249940, 1709250120, 1709250300, 1709250540, 
NA, 1709190420, 1709190720, 1709190900, 1709191020, 1709191140, 
1709191380, NA, 1709186700, 1709186880, NA, NA, NA, NA, 1709189340, 
1709189580, NA, 1709235300, 1709235480), tzone = "Europe/London", class = c("POSIXct", 
"POSIXt"))), row.names = c(NA, -50L), class = c("tbl_df", "tbl", 
"data.frame"))

I'm able to reduce the number of services down to about 260,000 once I've filtered the data to only include services departing my chosen station on a particular day of the week and within a specific time window.

If I remove all the timings and create a directed graph from the distinct connections then I can haphazardly generate a feasible route (though there are of course many routes which are not possible because of missed connections etc.)

services_distinct <-
  services %>%
  distinct(from,to)

g <- 
  graph_from_data_frame(
    services_distinct, 
    directed=TRUE)

short_paths <- 
  igraph::k_shortest_paths(
    g,
    from='CARLILE',
    to='NWCSTLE',
    mode='out',
    k=5)

$vpaths
$vpaths[[1]]
+ 9/28 vertices, named, from 4824964:
  [1] CARLILE WETHERL BRMPTNC HLTWHST HYDB    HEXHAM  PRUDHOE GTSHDMC NWCSTLE


$epaths
$epaths[[1]]
+ 8/35 edges from 4824964 (vertex names):
  [1] CARLILE->WETHERL WETHERL->BRMPTNC BRMPTNC->HLTWHST HLTWHST->HYDB    HYDB   ->HEXHAM  HEXHAM ->PRUDHOE
[7] PRUDHOE->GTSHDMC GTSHDMC->NWCSTLE

My question is how do I calculate the shortest path based on the scheduled departure timings? Allowing for waiting at intermediate stations for connections etc.? I can't find much about this in {igraph} documentation. Potential solutions I've seen involve manually traversing a series of possible shortest paths (i.e. like those given above) and assessing if they are possible based on updating arrival times but I wondered if there was any in-built functionality in {igraph} or other network packages for handling this?

0

There are 0 best solutions below