Understanding how a SQL recursive cte evaluates

39 Views Asked by At

I have the following CTE that I'd like to understand in depth a bit more (i.e., how it evaluates behind the scenes):

WITH RECURSIVE example (loc, a,b,c) AS (
    SELECT 'anchor', 1,2,3
    UNION ALL
    SELECT 'recursive', a*a, b*b, c*c FROM example WHERE c < 50
) SELECT * FROM example;

┌───────────┬───┬─────┬──────┐
│ loc       ┆ a ┆ b   ┆ c    │
╞═══════════╪═══╪═════╪══════╡
│ anchor    ┆ 1 ┆   2 ┆    3 │
│ recursive ┆ 1 ┆   4 ┆    9 │
│ recursive ┆ 1 ┆  16 ┆   81 │
└───────────┴───┴─────┴──────┘

Is the following a correct understanding of building the example (cte) table?

  1. It starts with the simple SELECT anchor, which gives us the starting result-set.
SELECT 'anchor', 1, 2, 3
┌───────────┬───┬─────┬──────┐
│ loc       ┆ a ┆ b   ┆ c    │
╞═══════════╪═══╪═════╪══════╡
│ anchor    ┆ 1 ┆   2 ┆    3 │
└───────────┴───┴─────┴──────┘
  1. Now it runs the recursive case and for the example table it will evaluate it iteratively, for the first row it has values of ('anchor', 1, 2, 3) and so the statement it now runs to generate the second item (first recursive case) is:
SELECT 'recursive' a*a, b*b, c*c FROM (VALUES ('anchor', 1,2,3)) AS example (loc,a,b,c)
WHERE c < 50
┌───────────┬───┬─────┬──────┐
│ loc       ┆ a ┆ b   ┆ c    │
╞═══════════╪═══╪═════╪══════╡
│ recursive ┆ 1 ┆   4 ┆    9 │
└───────────┴───┴─────┴──────┘
  1. Let's do the same thing again, now using the values from the current iteration:
SELECT 'recursive', a*a, b*b, c*c FROM (VALUES ('recursive', 1,4,9)) AS example (loc,a,b,c)
WHERE c < 50
┌───────────┬───┬─────┬──────┐
│ loc       ┆ a ┆ b   ┆ c    │
╞═══════════╪═══╪═════╪══════╡
│ recursive ┆ 1 ┆  16 ┆   81 │
└───────────┴───┴─────┴──────┘
  1. We do the same thing again. However, this time c will be more than 100, and so it will return no rows:
SELECT 'recursive', a*a, b*b, c*c FROM (VALUES ('recursive', 1,16,81)) AS example (loc,a,b,c)
WHERE c < 50
[empty]

Because we have no new results in the iteration and the cte stops when we have a fixpoint, we know that our cte is finished. Here is how it looks then in total, "iterating" with successive UNION ALLs instead of using the RECURSIVE keyword:

WITH example (loc, a,b,c) AS (
    SELECT 'anchor', 1, 2, 3
    UNION ALL
    SELECT 'recursive', a*a, b*b, c*c FROM (VALUES ('anchor',1,2,3)) example (loc,a,b,c) WHERE c < 50
    UNION ALL
    SELECT 'recursive', a*a, b*b, c*c FROM (VALUES ('recursive',   1,4,9)) example (loc,a,b,c) WHERE c < 50
    UNION ALL
    SELECT 'recursive', a*a, b*b, c*c FROM (VALUES ('recursive', 1,16,81)) example (loc,a,b,c) WHERE c < 50
) 
SELECT * FROM example

┌───────────┬───┬────┬────┐
│ loc       ┆ a ┆ b  ┆ c  │
╞═══════════╪═══╪════╪════╡
│ anchor    ┆ 1 ┆  2 ┆  3 │
│ recursive ┆ 1 ┆  4 ┆  9 │
│ recursive ┆ 1 ┆ 16 ┆ 81 │
└───────────┴───┴────┴────┘

I know this is a little bit 'of a stretch' how I wrote the cte out in long-form, but I was wondering if that seems like it is correct or accurate in my understanding of it.

0

There are 0 best solutions below