I have a SQL query which is giving 2 different execution plan when the selection of language changes from "English,Portuguese" to "Dutch,Portuguese". The selection for "Dutch,Portuguese" works fine but not for "English,Portuguese". I can see hash join has been picked by optimizer for "English,Portuguese" and even a different index is chosen for two different cases. Please suggest if there is anyway to fix it.
QUERY PLAN for poor performing Query "English,Portuguese"
---------------------------------------------------------
Hash Join (cost=17528.25..38211.18 rows=36 width=88) (actual time=199235.583..199237.334 rows=1 loops=1)
Hash Cond: (("PC0".Referencekey)::text = ("Case".Primarykey)::text)
-> Index Scan using bulkprocessfrombasket on TAB1 "PC0" (cost=0.68..20630.90 rows=20078 width=113) (actual time=0.023..237.373 rows=22012 loops=1)
Index Cond: ((OpId)::text = 'CustomerService2ndLine'::text)
Filter: ((ObjType)::text = 'Assign-WorkBasket'::text)
-> Hash (cost=17288.84..17288.84 rows=19099 width=35) (actual time=198974.541..198974.542 rows=178642 loops=1)
Buckets: 262144 (originally 32768) Batches: 1 (originally 1) Memory Usage: 14450kB
-> Index Scan using Laguageindex on tip_cs_work "Case" (cost=0.43..17288.84 rows=19099 width=35) (actual time=0.050..198698.998 rows=178642 loops=1)
Index Cond: (((Laguage)::text = ANY ('{English,Portuguese}'::text[])) AND ((Laguage)::text = 'Portuguese'::text))
Filter: ((ObjType)::text ~~ 'Tran-TIP-CS-Work%'::text)
Planning Time: 4.292 ms
Execution Time: 199237.660 ms
(12 rows)
QUERY PLAN for Acceptable performing Query for "Dutch,Portuguese"
-----------------------------------------------------------------
Gather (cost=1001.24..42048.70 rows=46 width=88) (actual time=1563.109..1570.671 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=1.24..41044.10 rows=19 width=88) (actual time=1525.785..1547.967 rows=0 loops=3)
-> Parallel Index Scan using bulkprocessfrombasket on TAB1 "PC0" (cost=0.68..20475.40 rows=8362 width=113) (actual time=5.513..1021.457 rows=7342 loops=3)
Index Cond: ((OpId)::text = 'CustomerService2ndLine'::text)
Filter: ((ObjType)::text = 'Assign-WorkBasket'::text)
-> Index Scan using pca_work_pk1 on tip_cs_work "Case" (cost=0.56..2.46 rows=1 width=35) (actual time=0.070..0.070 rows=0 loops=22025)
Index Cond: ((Primarykey)::text = ("PC0".Referencekey)::text)
Filter: (((ObjType)::text ~~ 'Tran-TIP-CS-Work%'::text) AND ((Laguage)::text = ANY ('{Dutch,Portuguese}'::text[])) AND ((Laguage)::text = 'Portuguese'::text))
Rows Removed by Filter: 0
Planning Time: 1.117 ms
Execution Time: 1570.740 ms
(13 rows)
The bad plan has this silly bit in it:
Presumably that came more or less directly from your query. If laguage is equal to 'Portuguese', obviously it is also =ANY('{English,Portuguese}'). There are some redundancies that the planner can detect and ignore, but not this one. It is going to estimate each condition independently and multiply them together, getting too low of an estimated number of rows (which we do see, estimate=19099 actual=178642). If this were fixed, it would probably then choose a better plan, but we can't be sure of that with the info at hand.
Why did the analogous silliness in other case not also cause a bad plan? Presumably the row estimate is different for that case, but we can't really see it in your plans since that particular row estimate is never reported in the alternate plan.
If you want to dig into this further, you could probably provoke a plan that would give us the estimate and actual row count we need to see by doing: