Postgres query planner filter order affected by using now on sequential scan

510 Views Asked by At

I have a query whereby, when enabling sequential scanning on the postgres database and using now() in the where clause, the query planner will prefer a sequential scan of the table and then filter:

    EXPLAIN ANALYZE
    SELECT
        action_id
    FROM
        events
    WHERE
        started_at IS NULL
        AND deleted_at IS NULL
        AND due_at < now()
        AND due_at > now() - interval '14 days'
    LIMIT 1
    FOR UPDATE
        SKIP LOCKED;

Example: https://explain.depesz.com/s/xLlM

Query with enable_seqscan db parameter set to false: https://explain.depesz.com/s/e8Fe

I am looking to help the query optimiser use the index.

I suspect the fact that the number of rows that match started_at is null and deleted_at is null filter roughly makes up 13% of the total table rows (and due_at column is completed unique and uncorrelated) means that the query optimiser is pessimistic about finding a match quickly enough using the index but in fact that's not the case.

EDIT: For the time being I have restructured the query like so:

        SELECT 
            id,
            previous_event_id,
            due_at,
            action_id,
            subscription_url 
        FROM (
            SELECT 
                id, 
                previous_event_id, 
                due_at, 
                action_id, 
                subscription_url from events 
                WHERE 
                    started_at is null 
                    AND deleted_at is null
                LIMIT 100
                FOR update SKIP LOCKED
        ) events_to_pick_from
        WHERE EXISTS (
            SELECT 1
                FROM events
            WHERE
                events_to_pick_from.due_at < now()
                AND events_to_pick_from.due_at > now() - interval '14 days'
                AND events.action_id = events_to_pick_from.action_id
        )
        LIMIT 1
        FOR UPDATE SKIP LOCKED;

https://explain.depesz.com/s/fz2h

But would be grateful for other suggestions

1

There are 1 best solutions below

13
Laurenz Albe On

Both queries have the same execution plan.

The difference is that the query with the constants happens to find a row that matches the condition quickly, after reading only 27 rows from the table.

The query using now() does not find a single matching row in the table (actual rows=0), but it has to scan all 7 million rows before it knows for sure.

An index on due_at should improve the performance considerably.