distHaversine results are too high

I am trying to calculate the distance between two WGS84 points in R, but the result is much higher than the real distance.

point1 <- c(34.94821,29.55805)
point2 <- c(34.79130,31.25181)

point_mat <- matrix(c(point1, point2), ncol =2 )  
distHaversine(point_mat, r=6378137)  

The result is 638845.2, but the real distance between the cities is 188.94 km.

There are 1 best solutions below

Ray On

Check how the input to the geosphere::distHaversine() function needs to be formatted.

In principle, you can pass the two points directly, with no need to convert them into a matrix, i.e. distHaversine(point1, point2). The radius argument r defaults to 6378137, the earth radius in metres, so the result is in metres. You can therefore omit r, or pass the radius in kilometres (or another unit) to get the distance in that unit.

E.g.

point1 <- c(34.94821,29.55805)
point2 <- c(34.79130,31.25181)

geosphere::distHaversine(point1,point2, r=6378137)  
geosphere::distHaversine(point1,point2)   # r defaults to 6378137 (metres)

Both calls yield:

[1] 189149.3

in metres, i.e. about 189.15 km, which is close to your expected result.
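For reference, the haversine computation itself is short enough to write out in base R. The sketch below is not geosphere's source code; haversine_m is a hypothetical helper name, and it assumes each point is given as c(longitude, latitude) in degrees, the same order distHaversine() expects.

```r
# Hypothetical helper, not part of geosphere: great-circle distance via the
# haversine formula. Points are c(longitude, latitude) in degrees.
haversine_m <- function(p1, p2, r = 6378137) {
  to_rad <- pi / 180
  lon1 <- p1[1] * to_rad; lat1 <- p1[2] * to_rad
  lon2 <- p2[1] * to_rad; lat2 <- p2[2] * to_rad
  dlat <- lat2 - lat1
  dlon <- lon2 - lon1
  a <- sin(dlat / 2)^2 + cos(lat1) * cos(lat2) * sin(dlon / 2)^2
  # Clamp to 1 to guard against rounding slightly above 1
  2 * r * asin(min(1, sqrt(a)))
}

point1 <- c(34.94821, 29.55805)
point2 <- c(34.79130, 31.25181)

d_m  <- haversine_m(point1, point2)                # radius in metres
d_km <- haversine_m(point1, point2, r = 6378.137)  # radius in kilometres
```

The result comes back in whatever unit r is given in, so d_m should land near the 189149.3 shown above and d_km near 189.15.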

If you use the matrix form, make sure the matrix is properly constructed.

point1
[1] 34.94821 29.55805

point2
[1] 34.79130 31.25181

matrix(c(point1, point2), ncol =2 )  
         [,1]     [,2]
[1,] 34.94821 34.79130
[2,] 29.55805 31.25181

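matrix() fills column by column by default, so each row now mixes values from both points: row 1 holds the two longitudes and row 2 the two latitudes. distHaversine() reads each row as one (longitude, latitude) point, so the value you obtained is the distance between two different points than the ones you intended.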

In this case, you have to fill the matrix with byrow = TRUE to keep each point's longitude and latitude together in one row:

matrix(c(point1, point2), byrow = TRUE, ncol = 2)  
         [,1]     [,2]
[1,] 34.94821 29.55805
[2,] 34.79130 31.25181

geosphere::distHaversine(matrix(c(point1, point2), byrow = TRUE, ncol = 2))
[1] 189149.3
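Another way to avoid the fill-order pitfall, as a small base R sketch: bind the points as rows with rbind(), which needs no byrow argument. The variable names wrong and right are only illustrative.

```r
point1 <- c(34.94821, 29.55805)
point2 <- c(34.79130, 31.25181)

# matrix() fills column by column, so each row mixes the two points
wrong <- matrix(c(point1, point2), ncol = 2)

# rbind() stacks each vector as one row: one (lon, lat) point per row
right <- rbind(point1, point2)

# Same values as the byrow = TRUE matrix (rbind only adds row names)
same_as_byrow <- all(right == matrix(c(point1, point2), byrow = TRUE, ncol = 2))

# The column-filled matrix is simply the transpose of the intended one
is_transpose <- all(wrong == t(right))
```

Passing right to distHaversine() in place of the byrow = TRUE matrix should therefore give the same distance.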

In many use cases, the data holds the "start" and "end" longitudes/latitudes in separate columns. You can use cbind() to build the input that distHaversine() requires, as follows:

# simulate a dataframe of positions given by start and end points
df <- data.frame(
      LON1 = c(34.94821, 34.94821)
    , LAT1 = c(29.55805, 34.79130)
    , LON2 = c(34.79130, 34.79130)
    , LAT2 = c(31.25181, 29.55805)
)

library(dplyr) # tabular data handling and pipe

df %>% 
  mutate(DIST = geosphere::distHaversine(
                          cbind(LON1, LAT1)
                        , cbind(LON2, LAT2)
                )
)

      LON1     LAT1    LON2     LAT2     DIST
1 34.94821 29.55805 34.7913 31.25181 189149.3
2 34.94821 34.79130 34.7913 29.55805 582750.0
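If you prefer not to load dplyr, the same column can be added in base R. This is a minimal sketch that assumes the geosphere package is installed; DIST_KM is a hypothetical extra column added only to show the unit conversion.

```r
df <- data.frame(
  LON1 = c(34.94821, 34.94821),
  LAT1 = c(29.55805, 34.79130),
  LON2 = c(34.79130, 34.79130),
  LAT2 = c(31.25181, 29.55805)
)

# Row-wise distances in metres; cbind() builds one (lon, lat) matrix per side
df$DIST <- geosphere::distHaversine(
  cbind(df$LON1, df$LAT1),
  cbind(df$LON2, df$LAT2)
)

# Convert metres to kilometres for readability
df$DIST_KM <- df$DIST / 1000
```

The DIST values should match the dplyr output above, since both versions make the same distHaversine() call.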