CROSS JOIN and STDistance to find closest point

47 Views Asked by At

I have two tables with Walmart locations, and I want to know which Walmart from one table is closest to the Walmart of the other location.

This would be the result, if possible:

Source        ClosestToSource       DistanceInMeters
-----------------------------------------------------
Walmart1      Walmart8              2637.37
Walmart2      Walmart5              2047.19
Walmart3      Walmart8              1191.31
Walmart4      Walmart7              3340.33

This is basically saying that the closest Walmart for Walmart1 in @Source is Walmart8 in @Dest.

IF OBJECT_ID('dbo.WSource', 'U') IS NOT NULL 
    DROP TABLE dbo.WSource; 

IF OBJECT_ID('dbo.WDest', 'U') IS NOT NULL 
    DROP TABLE dbo.WDest; 

CREATE TABLE dbo.WSource
(
    Id int IDENTITY,
    StoreName varchar(15),
    Lat float,
    Lon float,
    GeoLocation geography
)

CREATE TABLE WDest
(
    Id int IDENTITY,
    StoreName varchar(15),
    Lat float,
    Lon float,
    GeoLocation geography
)

INSERT INTO WSource (StoreName, Lat, Lon)
select 'Walmart1', 28.064924, -81.652854 union
select 'Walmart2', 28.073512, -81.655092 union
select 'Walmart3', 28.063939, -81.665558 union
select 'Walmart4', 28.061616, -81.630152

update WSource
set geolocation = geography::STPointFromText('POINT(' + CAST(Lon AS VARCHAR(20)) + ' ' + 
                    CAST(Lat AS VARCHAR(20)) + ')', 4326)


insert into WDest (StoreName, Lat, Lon)
select 'Walmart5', 28.078024, -81.675264 union
select 'Walmart6', 28.079128, -81.693943 union
select 'Walmart7', 28.089832, -81.642243 union
select 'Walmart8', 28.062409, -81.677583

update WDest
set geolocation = geography::STPointFromText('POINT(' + CAST(Lon AS VARCHAR(20)) + ' ' + 
                    CAST(Lat AS VARCHAR(20)) + ')', 4326)

SELECT WSource.geolocation.STDistance(WDest.geolocation)
FROM WSource
CROSS JOIN WDest

This shows all the distances from one location to the other, but that's where I got stuck. I was thinking of using RANK() function, but I'm not sure how to implement that.

Based on the results above, the closest to Walmart2 is Walmart5, but not sure how to implement it.

Or maybe there's a better way of doing this?

1

There are 1 best solutions below

3
Charlieface On

One option is an APPLY with a TOP (1).

SELECT
  WSource.StoreName AS Source,
  WDest.StoreName AS ClosestToSource,
  WDest.DistanceToSource
FROM WSource
CROSS APPLY (
    SELECT TOP (1)
      WDest.*,
      WDest.geolocation.STDistance(WSource.geolocation) AS DistanceToSource
    FROM WDest
    WHERE WDest.geolocation.STDistance(WSource.geolocation) AS DistanceToSource IS NOT NULL
    ORDER BY DistanceToSource
) WDest;

You can also use ranking functions

SELECT
  w.*
FROM (
    SELECT
      WSource.StoreName AS Source,
      WDest.StoreName AS ClosestToSource,
      WDest.geolocation.STDistance(WSource.geolocation) AS DistanceToSource,
      ROW_NUMBER() OVER (PARTITION BY WSource.Id ORDER BY WDest.geolocation.STDistance(WSource.geolocation)) AS rn
    FROM WSource
    CROSS JOIN WDest
    WHERE WDest.geolocation.STDistance(WSource.geolocation) IS NOT NULL
) w
WHERE w.rn = 1;

The performance difference would probably depend on whether you had a supporting Spatial Index: the former will perform very well if so, but very badly if not.

You may also want to check if it's faster calculating the distance from Source to Dest or vice versa.

db<>fiddle