Correlated subquery pattern is not supported due to internal error - where not exists correlated subquery

3.4k Views Asked by At

I have a query that is giving me the error above. My code is the following:

SELECT *,
      dense_rank() OVER (PARTITION BY email
                         ORDER BY priority_score,
                                  comp) AS r
FROM main_query
WHERE NOT EXISTS
   (SELECT name,
           event,
           email,
           report_date
    FROM gm
    WHERE  gm.name= main_query.name
      AND gm.event= main_query.event
      AND gm.email = main_query.email
      AND gm.report_date >= (CURRENT_DATE - 25)::date)
ORDER BY priority_score ASC

One solution that I saw to overcome these types of errors was to be able to transform correlated subqueries in queries not correlated (sherlock). Therefore, I am searching for other ways of using the where not exists statement but without a correlated subquery, i.e., calling the main_query table inside the subquery ((...) from gm left join main_query on(...)). Does anyone know if this is possible and how to do it?

Any advice is more than welcome and thanks a lot in advance!

2

There are 2 best solutions below

1
Gordon Linoff On BEST ANSWER

Does a LEFT JOIN version work?

SELECT mq.*,
      dense_rank() OVER (PARTITION BY mq.email
                         ORDER BY mq.priority_score, comp
                        ) AS r
FROM main_query mq LEFT JOIN
     gm
     ON gm.name = mq.name AND
        gm.event= mq.event AND
        gm.email = mq.email AND
        gm.report_date >= (CURRENT_DATE - 25)::date)
WHERE gm.name IS NULL  
ORDER BY priority_score ASC;

If that doesn't work, it should work like this:

SELECT mq.*,
      dense_rank() OVER (PARTITION BY mq.email
                         ORDER BY mq.priority_score, comp
                        ) AS r
FROM main_query mq LEFT JOIN
     (SELECT gm.*
      FROM gm
      WHERE gm.report_date >= (CURRENT_DATE - 25)::date)
     ) gm
     ON gm.name = mq.name AND
        gm.event= mq.event AND
        gm.email = mq.email
WHERE gm.name IS NULL  
ORDER BY priority_score ASC
0
Vince Hill On

From another post and my own experimentation, it seems to be the issue is the co-related subquery is disallowed by Redshift due to the potential of it returning more than one row, and this triggers the not very helpful error message "correlated subquery pattern is not supported due to internal error".

There is some AWS documentation on the types of co-related subqueries Redshift doesn't support.

I found I could not get the 'not exists' approach to work, but by using a count comparision I could achieve the same result, which has the advantage of requiring less reworking of the code than swapping to a 'left join' approach.

select *
from stl_ddltext a
where 1 > (
  select count(*) 
    from logs.mytable_stl_ddltext b 
   where a.userid    = b.userid
     and a.xid       = b.xid
     and a.pid       = b.pid
     and a.starttime = b.starttime
     and a.endtime   = b.endtime
     and a.text::varchar(200)      = b.text::varchar(200)
);