MySQL subquery appears to be slowing down my main query, any tips on how to speed this up?

66 Views Asked by At

I have a simple subquery in MySQL which produces a set of codes, something like 'LPR','LFR','LFL' (more codes can be returned). Using this in my main query slows it right down, like it can take a minute or 2 to run.

I'm unable to show the full code, and there are over 2,000,000 rows in the orders table but, it's basically:

SubQuery:

(SELECT DISTINCT prefix 
 FROM `photo` 
 WHERE thedate BETWEEN '2024/02/10' AND '2024/02/20' 
   AND user = 'ABCD1')

MainQuery:

SELECT
field1,
field2
FROM `orders`
WHERE ((orders.buyers_date) BETWEEN '2024/02/01' AND '2024/02/29') AND orders.user = 'ABCD1' 
AND orders.prefix IN(
    (SELECT DISTINCT prefix FROM `photo` WHERE thedate BETWEEN '2024/02/10' AND '2024/02/20' AND user = 'ABCD1')
) 

If within the MainQuery I replace the IN with the codes directly, as in

SELECT
field1,
field2
WHERE ((buyers_date) BETWEEN '2024/02/01' AND '2024/02/29') AND orders.user = 'ABCD1' 
AND orders.prefix IN('LPR','LFR','LFL') 

The query takes around 6 seconds. I thought the subquery would have run first, then the main query. This doesn't seem to be the case.

I did try :

SET @prefs = (SELECT DISTINCT prefix FROM `photo` WHERE thedate BETWEEN '2024/02/10' AND '2024/02/20' AND user = 'ABCD1')

SELECT @prefs;

The result is "MySQL returned an empty result set (i.e. zero rows)", it's Lying!

Any ideas?

2

There are 2 best solutions below

1
Akina On

Use WHERE EXISTS:

SELECT field1, field2
FROM orders
WHERE orders.buyers_date BETWEEN '2024/02/01' AND '2024/02/29'
  AND orders.user = 'ABCD1' 
  AND EXISTS ( SELECT NULL
               FROM photo
               WHERE photo.thedate BETWEEN '2024/02/10' AND '2024/02/20' 
                 AND photo.user = 'ABCD1'
                 AND orders.prefix = photo.prefix
               );

PS. Your date literals have wrong format (from MySQL looking point) - do you use VARCHAR datatype instead of DATE?

0
Rick James On

Add these indexes -- The order of the columns in these composite orders is important:

photo:   INDEX(user, theDate, prefix)
orders:  INDEX(user, prefix, buyers_date)
orders:  INDEX(prefix, user, buyers_date)

IN ( SELECT ... ) is rarely optimal.

This may be the best way to structure your query:

SELECT  o.field1, o.field2
    FROM  ( SELECT  DISTINCT prefix
                FROM  `photo`
                WHERE  thedate BETWEEN '2024/02/10' AND '2024/02/20'
                  AND  user = 'ABCD1') 
          ) AS p
    JOIN `orders` AS o  ON o.prefix = p.prefix
    WHERE  o.buyers_date BETWEEN '2024/02/01' AND '2024/02/29'
      AND  orders.user = 'ABCD1'

(I don't know which of my recommendations for index for orders is better. By having both, the Optimizer will pick the one it thinks is better. The choice may depend on the distribution of the data, so it may change. See EXPLAIN SELECT...)

Also, I don't know whether Akina's suggestion will be better. But his formulation needs:

photo:   INDEX(user, prefix, theDate)
orders:  INDEX(user, buyers_date, prefix)