Why does the Redshift query plan show a join between tables not joined in the query?


I'm facing a query performance issue on a fairly complex query in Redshift, and as a Redshift noob I don't understand it. The query is made of several joins (INNER and LEFT) and doesn't return data even after hours.

I commented/uncommented each join to understand where the bottleneck was coming from. Once I found the point where query performance exploded, I took a look at the query plan and noticed that Redshift was performing a join between two tables (around 1 million records and 15 thousand records respectively) that are not directly joined in the query. For this join, Redshift was only using the distribution key, which in this case generates a join close to a Cartesian product. Monitoring the query execution showed that the "unwanted" join was creating an internal working table of several billion (with a b) records.

I'd like to understand why the optimizer performs this join, and how I can prevent it.
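One workaround I'm considering is to materialize the external tables into local temporary tables first, so the planner works with real row counts instead of guesses. A sketch, where `join_key` and `other_cols` are placeholders for my actual columns:

```sql
-- Pull the external (Spectrum) data into a local temp table first,
-- applying the same filter the big query uses.
CREATE TEMP TABLE tmp_table1 AS
SELECT join_key, condition1, other_cols
FROM schemaname.table1
WHERE condition1 = 'A';

-- Refresh statistics so the optimizer sees the real cardinality.
ANALYZE tmp_table1;

-- Then rewrite the big query against tmp_table1 instead of the
-- external table, keeping the original join conditions.
```

Would this be a reasonable way to stop the optimizer from injecting the distribution-key-only join?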

As additional information, I'm using external tables in PARQUET format stored in S3. I also noticed that by putting a condition on one of the tables of the "unwanted" join, the plan was different and the join disappeared.

EDIT: Here's a piece of the explain plan. If I understand correctly, the second line shows a join using only the distribution key, whereas nowhere in the query is there a join using only this key:

->  XN Hash Join DS_BCAST_INNER  (cost=740000221.79..1073625281.79 rows=50000000 width=70)
  Hash Cond: ("outer".distfield = "inner".distfield)
  Remarks: Derives subplan 3
  ->  XN Partition Loop  (cost=0.00..251000060.00 rows=50000000 width=44)
        ->  XN Seq Scan PartitionInfo of schemaname.table1  (cost=0.00..60.00 rows=1 width=8)
              Filter: (((((filterfieldview)::text = 'v1'::text) AND (distfield = 26)) AND (subplan 3: ($7 = distfield)) AND (subplan 4: ($9 = distfield)) AND (subplan 5: ($13 = distfield)) AND (subplan 6: ($14 = distfield)) AND (subplan 7: ($16 = distfield)) AND (subplan 8: ($19 = distfield)) AND (subplan 9: (distfield = $22)) AND (subplan 10: (distfield = $25)) AND (subplan 11: ($28 = distfield)) AND (subplan 12: ($31 = distfield)) AND (subplan 13: ($34 = distfield)) AND (subplan 14: ($36 = distfield)) AND (subplan 15: (distfield = $40)) AND (subplan 16: ($46 = distfield)) AND (subplan 17: (distfield = $50)) AND (subplan 18: ($52 = distfield)))
        ->  XN S3 Query Scan table1  (cost=0.00..125500000.00 rows=50000000 width=36)
              ->  S3 Seq Scan schemaname.table1 location:"s3://s3location/TABLE1" format:PARQUET  (cost=0.00..125000000.00 rows=50000000 width=36)
                    Filter: ((condition1)::text = 'A'::text)
  ->  XN Hash  (cost=740000221.29..740000221.29 rows=200 width=26)
        ->  XN Subquery Scan volt_dt_11  (cost=740000219.29..740000221.29 rows=200 width=26)
              ->  XN HashAggregate  (cost=740000219.29..740000219.29 rows=200 width=26)
                    ->  XN Hash Join DS_DIST_BOTH  (cost=650000097.51..740000217.56 rows=347 width=26)
                          Outer Dist Key: "inner".derived_col1
                          Inner Dist Key: ref_darwin_lookup.derived_col1
                          Hash Cond: (("outer".derived_col1 = "inner".derived_col1) AND ("outer".distfield = "inner".distfield))
                          ->  XN Partition Loop  (cost=300000006.25..300000101.25 rows=1000 width=26)
                                ->  XN Seq Scan PartitionInfo of schemaname.table2  (cost=0.00..75.00 rows=1 width=8)
                                      Filter: ((((filterfieldview)::text = 'v1'::text) AND (distfield = 26)))
                                ->  XN S3 Query Scan table2  (cost=150000003.13..150000013.13 rows=1000 width=18)
                                      ->  S3 HashAggregate  (cost=150000003.13..150000003.13 rows=1000 width=18)
                                            ->  S3 Seq Scan schemaname.table2 location:"s3://s3location/TABLE2" format:PARQUET  (cost=0.00..150000000.00 rows=1251 width=18)
                                                  Filter: (((final_value)::text = 'T'::text) AND ((condition1)::text = 'A'::text) AND (condition2 IS NULL))
                          ->  XN Hash  (cost=350000086.26..350000086.26 rows=1000 width=26)
                                ->  XN Partition Loop  (cost=350000006.26..350000086.26 rows=1000 width=26)
                                      ->  XN Seq Scan PartitionInfo of schemaname.ref_darwin_lookup  (cost=0.00..60.00 rows=1 width=8)
                                            Filter: ((((filterfieldview)::text = 'v1'::text) AND (distfield = 26)))
                                      ->  XN S3 Query Scan ref_darwin_lookup  (cost=175000003.13..175000013.13 rows=1000 width=18)
                                            ->  S3 HashAggregate  (cost=175000003.13..175000003.13 rows=1000 width=18)
                                                  ->  S3 Seq Scan schemaname.ref_darwin_lookup location:"s3://s3location/LOOKUP/" format:PARQUET  (cost=0.00..175000000.00 rows=1251 width=18)
                                                        Filter: (((condition1)::text = 'TYPE2'::text) AND ((condition2)::text = 'Numeric'::text) AND ((condition3)::text = 'A'::text))
->  XN Hash  (cost=300000095.02..300000095.02 rows=1251 width=44)
  ->  XN Partition Loop  (cost=0.00..300000095.02 rows=1251 width=44)
        ->  XN Seq Scan PartitionInfo of schemaname.table3 volt_dt_8  (cost=0.00..70.00 rows=1 width=8)
              Filter: ((((filterfieldview)::text = 'v1'::text) AND (distfield = 31)))
        ->  XN S3 Query Scan volt_dt_8  (cost=0.00..150000012.51 rows=1251 width=36)
              ->  S3 Seq Scan schemaname.table3 volt_dt_8 location:"s3://s3location/TABLE3" format:PARQUET  (cost=0.00..150000000.00 rows=1251 width=36)
                    Filter: (((condition3)::text = 'T'::text) AND ((condition1)::text = 'A'::text) AND (condition2 IS NULL))
                                                   

When I monitor the execution of the query, I see millions of rows processed at the step treating TABLE1, although this table has only thousands of records...
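The `rows=50000000` estimates on every S3 Seq Scan also make me suspect the planner has no statistics for the external tables and is falling back to a default size. If that's the cause, would setting the `numRows` table property on the external tables give the optimizer realistic cardinalities? A sketch (the row counts here are placeholders for the approximate real sizes):

```sql
-- Tell Redshift Spectrum how big the external tables really are;
-- without this, the planner assumes a large default row count.
ALTER TABLE schemaname.table1
  SET TABLE PROPERTIES ('numRows' = '1000000');

ALTER TABLE schemaname.table2
  SET TABLE PROPERTIES ('numRows' = '15000');
```

Is missing external-table statistics a plausible explanation for the optimizer choosing this join?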
