I have a dataframe like this:
| index | place id | var_lat_fact | var_lon_fact |
|---|---|---|---|
| 0 | 167312091448 | 5.6679820000 | -0.0144950000 |
| 1 | 167312091448 | 5.6686320000 | -0.0157910000 |
| 2 | 167312091448 | 5.6653530000 | -0.0181980000 |
| 3 | 167312091448 | 5.6700970000 | -0.0191400000 |
| 4 | 167312091448 | 5.6689810000 | -0.0104040000 |
For each coordinate pair (lat, lon), I'd like to calculate the Euclidean distance to the nearest neighbour within the dataframe, so that each point gets a value in an additional column (say, nearest_neighbour_dist) giving that distance in meters.
Something like this:
| index | place id | var_lat_fact | var_lon_fact | nearest_neighbour_dist |
|---|---|---|---|---|
| 0 | 167312091448 | 5.6679820000 | -0.0144950000 | 160.588370 |
| 1 | 167312091448 | 5.6686320000 | -0.0157910000 | 160.588370 |
| 2 | 167312091448 | 5.6653530000 | -0.0181980000 | 451.525301 |
| 3 | 167312091448 | 5.6700970000 | -0.0191400000 | 404.794908 |
| 4 | 167312091448 | 5.6689810000 | -0.0104040000 | 466.104453 |
Just can't get my head around this... Any help would be greatly appreciated.
You can use sklearn's NearestNeighbors.
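A minimal sketch of that approach, assuming the haversine metric, a ball tree, and a mean Earth radius of 6,371 km (none of these are stated above; they are my assumptions, though they do reproduce the distances in the question's example table):

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Assumption: mean Earth radius in meters, used to scale the result.
EARTH_RADIUS_M = 6_371_000

# Sample data copied from the question.
df = pd.DataFrame({
    "place id": [167312091448] * 5,
    "var_lat_fact": [5.667982, 5.668632, 5.665353, 5.670097, 5.668981],
    "var_lon_fact": [-0.014495, -0.015791, -0.018198, -0.019140, -0.010404],
})

# The haversine metric expects [latitude, longitude] in radians.
coords = np.radians(df[["var_lat_fact", "var_lon_fact"]].to_numpy())

# Ask for 2 neighbours: the closest hit for each point is the point itself
# (distance 0), so the second column holds the true nearest neighbour.
nn = NearestNeighbors(n_neighbors=2, algorithm="ball_tree", metric="haversine")
nn.fit(coords)
dist_rad, _ = nn.kneighbors(coords)

# Haversine distances come back in radians on the unit sphere; scale to meters.
df["nearest_neighbour_dist"] = dist_rad[:, 1] * EARTH_RADIUS_M
print(df)
```

Note that if the dataframe contains exact duplicate coordinates, the second column can also be 0, so you may want to drop duplicates first or request more neighbours.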
[Image: points on a map]
I wanted to double-check the validity of the computations: the distance from point 1 to point 2 (index 0 -> 1 in your data) is indeed about 160.6 meters.
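The same check can also be done by hand without sklearn. The sketch below applies the haversine formula directly to the first two rows, again assuming a mean Earth radius of 6,371 km; `haversine_m` is a helper name I made up for illustration. The result should land close to the 160.6 meters quoted above.

```python
from math import asin, cos, radians, sin, sqrt

# Assumption: mean Earth radius in meters.
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Rows with index 0 and 1 from the question's dataframe.
d01 = haversine_m(5.667982, -0.014495, 5.668632, -0.015791)
print(round(d01, 1))
```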