Efficiently Turn Matrix of Intersecting Routes Into Simplified Spatial Network (Graph)

62 Views Asked by At

I have a matrix of real detailed routes which I want to efficiently turn into a simple spatial network. Simple means that I don't care about the intricacies of local transport and possible intersections of routes near the start and end points. I do want to add major intersections outside the start and end points as nodes to the network. Below I give a simple example. My real data has 12,500 routes between 500 start and end points and is approx. 2Gb in size.

library(fastverse)
#> -- Attaching packages --------------------------------------- fastverse 0.3.2 --
#> v data.table 1.15.0     v kit        0.0.13
#> v magrittr   2.0.3      v collapse   2.0.12
fastverse_extend(osrm, sf, sfnetworks, install = TRUE)
#> -- Attaching extension packages ----------------------------- fastverse 0.3.2 --
#> v osrm       4.1.1      v sfnetworks 0.6.3 
#> v sf         1.0.16

largest_20_german_cities <- data.frame(
  city = c("Berlin", "Stuttgart", "Munich", "Hamburg", "Cologne", "Frankfurt",
    "Duesseldorf", "Leipzig", "Dortmund", "Essen", "Bremen", "Dresden",
    "Hannover", "Nuremberg", "Duisburg", "Bochum", "Wuppertal", "Bielefeld", "Bonn", "Muenster"),
  lon = c(13.405, 9.18, 11.575, 10, 6.9528, 8.6822, 6.7833, 12.375, 7.4653, 7.0131,
          8.8072, 13.74, 9.7167, 11.0775, 6.7625, 7.2158, 7.1833, 8.5347, 7.1, 7.6256),
  lat = c(52.52, 48.7775, 48.1375, 53.55, 50.9364, 50.1106, 51.2333, 51.34, 51.5139,
          51.4508, 53.0758, 51.05, 52.3667, 49.4539, 51.4347, 51.4819, 51.2667, 52.0211, 50.7333, 51.9625))

# Unique routes
m <- matrix(1, 20, 20)
diag(m) <- NA
m[upper.tri(m)] <- NA
routes_ind <- which(!is.na(m), arr.ind = TRUE)
rm(m)

# Routes DF
routes <- data.table(from_city = largest_20_german_cities$city[routes_ind[, 1]], 
                     to_city = largest_20_german_cities$city[routes_ind[, 2]], 
                     duration = NA_real_, 
                     distance = NA_real_, 
                     geometry = list())
# Fetch Routes
i = 1L
for (r in mrtl(routes_ind)) {
  route <- osrmRoute(ss(largest_20_german_cities, r[1], c("lon", "lat")),
                     ss(largest_20_german_cities, r[2], c("lon", "lat")), overview = "full")
  set(routes, i, 3:5, fselect(route, duration, distance, geometry))
  i <- i + 1L
}
routes %<>% st_as_sf(crs = st_crs(route))

routes_net = as_sfnetwork(routes, directed = FALSE)
print(routes_net)
#> # A sfnetwork with 20 nodes and 190 edges
#> #
#> # CRS:  EPSG:4326 
#> #
#> # An undirected simple graph with 1 component with spatially explicit edges
#> #
#> # A tibble: 20 × 1
#>              geometry
#>           <POINT [°]>
#> 1  (9.179999 48.7775)
#> 2      (13.405 52.52)
#> 3 (11.57486 48.13675)
#> 4 (10.00001 53.54996)
#> 5   (6.95285 50.9364)
#> 6   (8.68202 50.1109)
#> # ℹ 14 more rows
#> #
#> # A tibble: 190 × 7
#>    from    to from_city to_city duration distance                       geometry
#>   <int> <int> <chr>     <chr>      <dbl>    <dbl>               <LINESTRING [°]>
#> 1     1     2 Stuttgart Berlin      390.     633. (9.179999 48.7775, 9.18005 48…
#> 2     2     3 Munich    Berlin      356.     586. (11.57486 48.13675, 11.57486 …
#> 3     2     4 Hamburg   Berlin      176.     288. (10.00001 53.54996, 10.0002 5…
#> # ℹ 187 more rows
plot(routes_net)

Created on 2024-03-28 with reprex v2.0.2

Regarding possible solutions I am open to any software (R, Python, QGIS etc.). I know in R there is tidygraph which allows me to do something like

library(tidygraph)
routes_net_subdiv = convert(routes_net, to_spatial_subdivision)

But this seems to run forever even with this mock example. I have also seen ideas to use GRASS's v.clean tool to break up the geometry, but haven't tried that yet and a bit reluctant to install GRASS.

I think perhaps the best solution for performance is converting to S2 and comparing all linestrings individually using s2_intersection() and then turning this information into a graph somehow. But hoping for more elegant and performant solutions.

1

There are 1 best solutions below

4
Timeless On

I am open to any software (R, Python, QGIS etc.)

With and IIUC, you can use this primal approach after getting the available directions :

cities = gpd.GeoDataFrame(
    pd.DataFrame(data), crs="EPSG:4326",
    geometry=gpd.points_from_xy(data["lon"], data["lat"])
)

client = openrouteservice.Client(key="") # put here your key

def get_route(coords, as_ls=True):
    res = client.directions(coords, profile="driving-car", format="geojson")
    geom = res["features"][0]["geometry"]
    return shape(geom) if as_ls else geom

combs_lonlat = combinations(cities[["lon", "lat"]].agg(tuple, axis=1), 2)

# maybe we should use a standard loop and add time.sleep(n) ?
lines = [get_route(c) for c in combs_lonlat] # will eventually show warnings

routes = gpd.GeoDataFrame(geometry=lines, crs=4326)

NB: You'll need to signup to openrouteservice in order to generate/get your api key.

Now, to make the graph, you can use momepy :

splines = set()
for ll, rl in combinations(routes["geometry"], 2):
    try:
        for line in split(ll, rl).geoms:
            splines.add(line)
    except ValueError:
        pass

G = gdf_to_nx(
    gpd.GeoDataFrame(geometry=list(splines)),
    multigraph=False, directed=False,
    approach="primal",
)

len(G.nodes) # 250
len(G.edges) # 1296

Plot (optional) :

fig, (axl, axr) = plt.subplots(1, 2)

for ax, inp in zip((axl, axr), (cities, G)):
    routes.plot(color="k", ax=ax)
    cities.plot(color="r", marker="x", zorder=3, ax=ax)
    _ = ax.set_title(f"{len(inp)} nodes", fontweight="bold")
    _ = ax.axis("off")

for x, y, name in zip(
    cities["geometry"].x,
    cities["geometry"].y,
    cities["city"]
):
    for ax in (axl, axr):
        _ = ax.annotate(
            name, xy=(x, y), xytext=(3, 3), textcoords="offset points"
        )

(gpd.GeoSeries([Point(n) for n in G.nodes], crs=4326)
     .plot(color="b", marker="o", ax=axr))

enter image description here

Used imports/input :

import pandas as pd
import geopandas as gpd
import openrouteservice
import matplotlib.pyplot as plt
from itertools import combinations
from shapely.geometry import shape, Point
from shapely.ops import split
from momepy import gdf_to_nx

data = {
    "city": [
        "Berlin", "Stuttgart", "Munich", "Hamburg", "Cologne",
        "Frankfurt", "Duesseldorf", "Leipzig", "Dortmund", "Essen",
        "Bremen", "Dresden", "Hannover", "Nuremberg", "Duisburg",
        "Bochum", "Wuppertal", "Bielefeld", "Bonn", "Muenster"
    ],
    "lon": [
        13.405, 9.18, 11.575, 10, 6.9528, 8.6822, 6.7833, 12.375,
        7.4653, 7.0131, 8.8072, 13.74, 9.7167, 11.0775, 6.7625, 7.2158,
        7.1833, 8.5347, 7.1, 7.6256
    ],
    "lat": [
        52.52, 48.7775, 48.1375, 53.55, 50.9364, 50.1106, 51.2333, 51.34,
        51.5139, 51.4508, 53.0758, 51.05, 52.3667, 49.4539, 51.4347, 51.4819,
        51.2667, 52.0211, 50.7333, 51.9625
    ]
}