I need help finding a faster distance matrix tool for approximately 107,000 source points and 220 destination points.

I currently use OpenRouteService on Docker. It throws a memory error if I send the whole request at once, so I run it in a loop, splitting the 107,000 source points into chunks (see the loop sketch after the code below). Calculating the distances this way takes more than 2 hours.

The same process takes 6 minutes when I run the equivalent script in R.

Why is that? How can I make the Python version faster? Are there any alternatives?

import openrouteservice

# client is connected to the local Docker container, not the online API
# (base_url assumes the default ORS Docker port/path; adjust to your setup)
client = openrouteservice.Client(base_url="http://localhost:8080/ors")

# index of the first row where the paddocks start
paddockidx = site_and_point_coords.loc[
    site_and_point_coords["site_or_paddock"] == "paddock"
].index[0]

# all coordinates for both sites and paddocks as (lon, lat) pairs
coordinates = list(zip(site_and_point_coords.lon.values, site_and_point_coords.lat.values))

# destinations, i.e. site locations (the rows before paddockidx)
destinations = list(range(paddockidx))

# sources, i.e. paddocks (the rows from paddockidx onward)
sources = list(range(paddockidx, len(coordinates)))

# one batch of the loop: the first 1/20th of the sources against all destinations
matrix = client.distance_matrix(
    locations=coordinates,
    sources=sources[: len(sources) // 20],
    destinations=destinations,
    profile="driving-car",
    metrics=["distance"],
    units="km",
    validate=True,
)
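
For reference, the surrounding loop looks roughly like this (a minimal sketch: the chunk count of 20 and the NumPy stacking of results are illustrative, not my exact code):

import numpy as np

# split the ~107,000 paddock indices into batches small enough to avoid
# the memory error, then query the container one batch at a time
n_chunks = 20
distance_blocks = []
for chunk in np.array_split(np.array(sources), n_chunks):
    matrix = client.distance_matrix(
        locations=coordinates,
        sources=chunk.tolist(),
        destinations=destinations,
        profile="driving-car",
        metrics=["distance"],
        units="km",
    )
    # the matrix endpoint returns the values under the "distances" key
    distance_blocks.append(np.array(matrix["distances"]))

# final 107000 x 220 array of driving distances in km
all_distances = np.vstack(distance_blocks)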