I have a performance problem with a SQL query running on SQL Server 2019 that returns the lat/long from a geography column.
My query is as follows and it takes around 5 seconds to return 553 rows:
SELECT ActivityLocations.ID, ActivityLocations.ActivityID, ActivityLocations.Number, ActivityLocations.Location.Lat AS 'Latitude', ActivityLocations.Location.Long AS 'Longitude'
FROM Plans
INNER JOIN Activities ON Plans.ID = Activities.PlanID
INNER JOIN ActivityLocations ON Activities.ID = ActivityLocations.ActivityID
WHERE CustomerID = 35041
The query plan it generates is:

But if I change the query a little to return a bit less data, it takes 0 seconds to return 207 rows:
SELECT ActivityLocations.ID, ActivityLocations.ActivityID, ActivityLocations.Number, ActivityLocations.Location.Lat AS 'Latitude', ActivityLocations.Location.Long AS 'Longitude'
FROM Plans
INNER JOIN Activities ON Plans.ID = Activities.PlanID
INNER JOIN ActivityLocations ON Activities.ID = ActivityLocations.ActivityID
WHERE PlanID > 22486
I guess my question is: why is the Compute Scalar operation happening before the join in the slow query and after the join in the fast query? I don't understand why it would do the Lat/Long computation on every row of the ActivityLocations table when we only want a small subset of the rows.
Any help would be greatly appreciated.
Edited to contain table information
CREATE TABLE [dbo].[Activities](
[ID] [int] NOT NULL,
[PlanID] [int] NOT NULL,
[Name] [nvarchar](255) NOT NULL,
CONSTRAINT [PK_Activity] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[ActivityLocations](
[ID] [int] NOT NULL,
[ActivityID] [int] NOT NULL,
[Number] [int] NOT NULL,
[Location] [geography] NOT NULL,
CONSTRAINT [PK_ActivityLocations] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
CREATE TABLE [dbo].[ActivityPlans](
[ID] [int] NOT NULL,
[CustomerID] [int] NOT NULL,
[PurchaseOrder] [nvarchar](255) NULL,
[Deleted] [bit] NOT NULL,
[Name] [nvarchar](500) NULL,
CONSTRAINT [PK_ActivityPlan] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [PlanID_IX] ON [dbo].[Activities]
(
[PlanID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [ActivityID_IX] ON [dbo].[ActivityLocations]
(
[ActivityID] ASC
)
INCLUDE([Number],[Location]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [CustomerID_NCIX] ON [dbo].[ActivityPlans]
(
[CustomerID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]

NOTE: The OP changed the name of the table Plans to ActivityPlans between the originally posted queries/graphical execution plans and the subsequently provided execution plan (via Paste The Plan) and DDL scripts. I am going to speak in terms of the table Plans (as it was originally posted).

I am going to try to explain this in its entirety, including noting the things that are not an issue between these two queries, because the obvious difference between the two queries is not what causes the issue (per se). Read on for my explanation.
Similarities
First, let's talk about where the two execution plans are similar. It is important to note the similarities because (assuming the second query plan is acceptable for the OP) the issue does not lie in the parts the two plans have in common.

- Both queries first access the tables Plans and Activities, filtered by your WHERE clause. Which is good. The optimizer makes a good decision and is able to use an index seek in both cases. The filter WHERE CustomerID = 35041 is resolved to an index seek on the index CustomerID_NCIX of the Plans table, and the filter WHERE PlanID > 22486 is resolved to an index seek on the index PlanID_IX of the Activities table. Then the join to the subsequent table (Activities in the first query and Plans in the second query) is done. It is supported by an index in both cases, the estimates for the seek operations are not horrific, and both joins are done using Nested Loops that output a number of rows relatively close to the final result set. So even though the only visual difference between these two queries is the WHERE clause, each WHERE clause is handled fairly similarly and does not appear to be an issue.
- Both queries then access the table ActivityLocations using the index ActivityID_IX.
- Both queries have a Compute Scalar operator that retrieves the values required for the expressions ActivityLocations.Location.Lat and ActivityLocations.Location.Long of your SELECT statement from ActivityLocations.Location.

Differences
Now let's talk about the (important) differences, which is where the issues lie.
- The first query accesses the table ActivityLocations using an Index Scan operator, while the second query uses an Index Seek operator.
- The Index Scan operator that accesses the table ActivityLocations in the first query has an Actual/Estimated Row Count of 329,475/331,523, while the Index Seek operator that accesses the table ActivityLocations in the second query has an Actual/Estimated Row Count of 207/9.
- The first query uses a Merge Join (joining the rows from ActivityLocations to the result of the join between the tables Plans and Activities), while the second query uses a Nested Loop Join.

What is the actual problem?
Put simply, the first query reads much more data, as the execution plan shows. The roughly 300k rows read from the table ActivityLocations in the first query are far more than the 207 rows read in the second query. Additionally, the Compute Scalar operator of the first query has to compute values for (the same) roughly 300k rows, as opposed to the 207 rows of the second query. This obviously leads to a longer-running query.

It is also worth noting that the larger row count coming from the table ActivityLocations is the reason for the Merge Join (seen in the first query plan) in place of the Nested Loop Join operator (seen in the second query plan). According to the optimizer, given your environment, a Merge Join is more suitable for joining 300k rows to 3.3k rows than a Nested Loop Join would be. And a Merge Join requires both sides of the join to be sorted by the join columns, hence the additional Sort operator in the query plan of the first query.
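If you want to see that difference in work outside of the graphical plan, one simple way (nothing specific to your schema, just standard session options) is to compare logical reads and CPU time for the two queries:

SET STATISTICS IO ON;   -- reports logical/physical reads per table in the Messages tab
SET STATISTICS TIME ON; -- reports CPU and elapsed time per statement

-- Run the slow query and the fast query here and compare the output;
-- the reads against ActivityLocations should mirror the ~300k vs 207 rows discussed above.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;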
Why is it happening?
Estimates. Estimates drive the optimizer's decision making. In the first query, the estimated number of rows to read from the table ActivityLocations (from the Index Scan) is 331,523, and in the second query (from the Index Seek) the estimate is 9. It might seem odd to say, but these estimates are closer to reality than you might think. An Index Scan (with up-to-date statistics) will usually have a row estimate equal to the number of rows in the table (except for filtered indexes). An Index Seek would ideally estimate a lower number of rows than the table contains; ideally that number would match the actual number of rows the Index Seek has to touch, but the fact that your Index Seek estimate is lower than the entire table is a step in the right direction.

So if the issue is not with the estimates in the Index Scan or Index Seek, then where is it? The issue is in the choice to access the table ActivityLocations with an Index Scan in the first query rather than an Index Seek. So why does the first query choose an Index Scan? It is quite clear from the execution plan that an Index Seek would have been the better option. I believe the answer in this case is cardinality estimation, specifically the cardinality estimation used for the join to the table ActivityLocations. We see that the estimated number of rows in the output of the final join of the first query increases relative to its input Sort operator (3,341 -> 3,402), while the estimated number of rows in the output of the final join of the second query stays consistent with its input Nested Loop operator (207 -> 207). And not only did the optimizer estimate this, it was right: the actual row counts returned from these same operators show the same pattern.
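If you want to check those estimated versus actual row counts yourself, one generic way (not tied to anything in this particular plan) is to capture the actual plan as showplan XML:

SET STATISTICS XML ON;  -- returns the actual execution plan as XML alongside the results

-- Run the query of interest here; each operator in the plan XML carries its
-- estimated row count, and the run-time counters carry the actual row count.

SET STATISTICS XML OFF;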
Why does that matter? It means that, based on the optimizer's estimate, the join to the table ActivityLocations is going to increase the row count of the input result set; in other words, this join is going to be one (input row) to many (output rows). Keep in mind that the optimizer needs to return your requested values ActivityLocations.Location.Lat and ActivityLocations.Location.Long from the table ActivityLocations. So when it considers a join that it believes will multiply the rows coming out of ActivityLocations, while also knowing it must perform a Compute Scalar on columns output from that table, it makes sense to run the Compute Scalar before the join: run before the join, the Compute Scalar is guaranteed to execute only once per row of ActivityLocations, a guarantee that does not hold if it runs after the join. In this case, the join is actually what ends up limiting the rows from ActivityLocations, and the number of rows returned from that table (for the purposes of this query) is much lower than the row count of the table. In the second query, the estimate says the output row count will stay the same, so running the Compute Scalar after the join makes no difference to the number of rows that must be computed, and it makes sense to perform an Index Seek in place of the Index Scan.

So to summarize: the rows returned from the first two tables of your first and second query (via your WHERE clause) are different, and it is likely that the rows returned in the first query resulted in a join estimation with a different cardinality than the second query. Hence the differences in how the query plan was built and how it subsequently ran.

Cardinality estimation (with joins specifically) is made up of several factors. If you really want to get into the internals, I would recommend the two articles from the legend Paul White and from SQL Shack. The things discussed there should guide you in reviewing the estimation within your system.
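As a concrete starting point for that review, the statistics histogram on the join column is one of the main inputs to the join estimate, and you can look at it directly. The object and index names below are taken from the DDL above; the command itself is standard:

-- Header, density vector, and histogram for the statistics behind ActivityID_IX.
-- The ActivityID histogram is what feeds the join estimate discussed above.
DBCC SHOW_STATISTICS ('dbo.ActivityLocations', ActivityID_IX);

-- And the other side of that join, for comparison.
DBCC SHOW_STATISTICS ('dbo.Activities', PlanID_IX);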
How to fix it?
The first goal would be to improve the estimate. If the estimated cardinality for the join was incorrect (which is actually not the case here) then updating the statistics might help. Out of date statistics could lead to a bad cardinality estimate, and thus a bad query plan.
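For completeness, if you did want to refresh them, a minimal sketch follows (table names follow the Plans naming used in this answer; WITH FULLSCAN is optional and can be expensive on large tables):

-- Rebuild statistics on the tables involved, reading every row for maximum accuracy.
UPDATE STATISTICS dbo.Plans WITH FULLSCAN;
UPDATE STATISTICS dbo.Activities WITH FULLSCAN;
UPDATE STATISTICS dbo.ActivityLocations WITH FULLSCAN;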
In some cases you might be able to restructure your query into a logical equivalent that results in a better execution plan, whether by writing it so that it produces better estimates or even possibly so that it touches the rows in a different order. In this case, the first query looks fine to me, so rewriting it as a logical equivalent would likely not help.
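Purely as an illustration of what such a restructuring can look like (and, as noted, I would not expect it to change anything here), this hypothetical variant pre-filters the Plans/Activities side in a derived table before joining to ActivityLocations:

SELECT al.ID, al.ActivityID, al.Number, al.Location.Lat AS 'Latitude', al.Location.Long AS 'Longitude'
FROM (
    -- Resolve the WHERE clause against Plans/Activities first.
    SELECT Activities.ID
    FROM Plans
    INNER JOIN Activities ON Plans.ID = Activities.PlanID
    WHERE Plans.CustomerID = 35041
) AS FilteredActivities
INNER JOIN ActivityLocations AS al ON FilteredActivities.ID = al.ActivityID;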
In this case the cardinality estimate is correct (and you mentioned that you updated statistics and it did not help), and the query appears to be written in a good fashion, but the execution plan chosen is still sub-optimal. So I would recommend a query hint. There is plenty of information out there on why query hints can be a bad idea, but given the information I have here I would say it is the best option, and I am always open to other opinions. This problem could easily be solved with a hint that forces a seek on the index ActivityID_IX of ActivityLocations; the join in your first query would look like the sketch below. Cheers!
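For example, using the FORCESEEK table hint pointed at ActivityID_IX (one way of expressing "seek this index"; the exact hint you choose is up to you):

-- Force a seek on ActivityID_IX when accessing ActivityLocations.
SELECT ActivityLocations.ID, ActivityLocations.ActivityID, ActivityLocations.Number, ActivityLocations.Location.Lat AS 'Latitude', ActivityLocations.Location.Long AS 'Longitude'
FROM Plans
INNER JOIN Activities ON Plans.ID = Activities.PlanID
INNER JOIN ActivityLocations WITH (FORCESEEK (ActivityID_IX (ActivityID))) ON Activities.ID = ActivityLocations.ActivityID
WHERE CustomerID = 35041

Note that FORCESEEK is a table hint rather than an OPTION-level query hint, so it constrains only how ActivityLocations is accessed and leaves the rest of the plan to the optimizer.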