How to get first value conditionally in Snowflake?

44 Views Asked by At

I have events table in Snowflake, one row per event. status can be failed that implies error or completed that implies error resolution.
For each customer_id, I want to find two timestamps resolved_at and first_failed_at and ultimately calculate the error resolution in days between these two timestamps. How to achieve this?

I find the resolved_at like this but couldn't find first_failed_at. Note that there can be multiple events failed consecutively.

with windowed as (
    select 
        *,
        lag(status) over (partition by customer_id order by created_at) as prev_status,
        lead(status) over (partition by customer_id order by created_at) as next_status,
        lead(created_at) over (partition by customer_id order by created_at) as next_created_at
    from events
)

select 
    event_id,
    customer_id,
    status,
    created_at,
    case 
        when status = 'failed' and next_status = 'completed' then next_created_at
        else null
    end as resolved_at
from windowed

Here is the example data and two additional desired columns:

+--------------------------------------+-------------+-----------+---------------------+---------------------+---------------------+
|               event_id               | customer_id |  status   |     created_at      |     resolved_at     |   first_failed_at   |
+--------------------------------------+-------------+-----------+---------------------+---------------------+---------------------+
| e7921c9c-aa8f-49bb-83fa-31fa37d629ee | a           | completed | 2024-01-16 16:23:31 |                     |                     |
| 1d6a5bdf-e254-441d-9574-821844dc6c05 | a           | failed    | 2024-01-17 19:17:33 |                     | 2024-01-17 19:17:33 |
| 1a06783a-2ca4-4f68-80f0-7e7c84ac57fd | a           | failed    | 2024-01-22 18:07:54 |                     | 2024-01-17 19:17:33 |
| 61ecc8da-1e7f-4b19-92f1-fda93661a3af | a           | failed    | 2024-01-23 17:54:12 | 2024-01-24 19:34:48 | 2024-01-17 19:17:33 |
| 068e4220-bd57-410a-85f3-1fbb18f86053 | a           | completed | 2024-01-24 19:34:48 |                     |                     |
| 4ad30d16-ff13-48f3-b315-8a588d869c08 | a           | completed | 2024-01-25 17:30:54 |                     |                     |
| 82f8c57a-0e80-4ea6-96ad-3b578e5ec68a | a           | failed    | 2024-01-27 17:45:11 |                     | 2024-01-27 17:45:11 |
| 17d736d6-40de-4440-a220-e4a0612acb71 | a           | failed    | 2024-01-28 17:51:15 | 2024-01-29 17:23:31 | 2024-01-27 17:45:11 |
| be657472-69d0-4cd5-82e0-b33434ff213a | a           | completed | 2024-01-29 17:23:31 |                     |                     |
| 3bce3901-2639-489d-8280-04518a253b33 | b           | failed    | 2023-12-15 15:06:32 |                     | 2023-12-15 15:06:32 |
| 99b1bd8b-0ac3-4868-b687-73fc93ef9ddf | b           | failed    | 2023-12-16 15:07:21 | 2023-12-17 15:10:42 | 2023-12-15 15:06:32 |
| 406b42e6-63c6-402e-9519-87240d007315 | b           | completed | 2023-12-17 15:10:42 |                     |                     |
| 946e4c25-db5a-44e4-8c29-d7144ff5b03c | c           | failed    | 2023-12-15 15:06:10 |                     | 2023-12-15 15:06:10 |
| ba0c96d2-4b68-43e5-99a6-0f30530dd1ad | c           | failed    | 2023-12-16 15:06:32 |                     | 2023-12-15 15:06:10 |
+--------------------------------------+-------------+-----------+---------------------+---------------------+---------------------+
1

There are 1 best solutions below

2
ValNik On

See example for SQL Server. I think, this is applicable for snowflake.

  1. Set number for each group or rows, where 0,1 or more 'failed' rows and 0 or 1 'completed' row. Group number is number of 'completed' row for customer_id, ordered by created_ad.

  2. Calculate datetime for 'completed' row for group (null if no sach row), and calculate first 'failed' row.

with windowed as (
 select *,
  lag(status) over (partition by customer_id order by created_at) as prev_status,
  lead(status) over (partition by customer_id order by created_at) as next_status,
  lead(created_at) over (partition by customer_id order by created_at) as next_created_at,
  coalesce(sum(case when status='completed' then 1 else 0 end) 
     over (partition by customer_id order by created_at 
           rows between unbounded preceding and 1 preceding),0) as groupNum
 from events
)

select 
    event_id,
    customer_id,
    status,
    created_at,
    groupNum,
    max(case when status='completed' then created_at end)
        over(partition by customer_id,groupNum) resolved_at,
    min(case when status='failed' then created_at end)
        over(partition by customer_id,groupNum) first_failed_at
from windowed

Query output

event_id customer_id status created_at groupNum resolved_at first_failed_at
e7921c9c-aa8f-49bb-83fa-31fa37d629ee a completed 2024-01-16 16:23:31.000 0 2024-01-16 16:23:31.000 null
1d6a5bdf-e254-441d-9574-821844dc6c05 a failed 2024-01-17 19:17:33.000 1 2024-01-24 19:34:48.000 2024-01-17 19:17:33.000
1a06783a-2ca4-4f68-80f0-7e7c84ac57fd a failed 2024-01-22 18:07:54.000 1 2024-01-24 19:34:48.000 2024-01-17 19:17:33.000
61ecc8da-1e7f-4b19-92f1-fda93661a3af a failed 2024-01-23 17:54:12.000 1 2024-01-24 19:34:48.000 2024-01-17 19:17:33.000
068e4220-bd57-410a-85f3-1fbb18f86053 a completed 2024-01-24 19:34:48.000 1 2024-01-24 19:34:48.000 2024-01-17 19:17:33.000
4ad30d16-ff13-48f3-b315-8a588d869c08 a completed 2024-01-25 17:30:54.000 2 2024-01-25 17:30:54.000 null
82f8c57a-0e80-4ea6-96ad-3b578e5ec68a a failed 2024-01-27 17:45:11.000 3 2024-01-29 17:23:31.000 2024-01-27 17:45:11.000
17d736d6-40de-4440-a220-e4a0612acb71 a failed 2024-01-28 17:51:15.000 3 2024-01-29 17:23:31.000 2024-01-27 17:45:11.000
be657472-69d0-4cd5-82e0-b33434ff213a a completed 2024-01-29 17:23:31.000 3 2024-01-29 17:23:31.000 2024-01-27 17:45:11.000
3bce3901-2639-489d-8280-04518a253b33 b failed 2023-12-15 15:06:32.000 0 2023-12-17 15:10:42.000 2023-12-15 15:06:32.000
99b1bd8b-0ac3-4868-b687-73fc93ef9ddf b failed 2023-12-16 15:07:21.000 0 2023-12-17 15:10:42.000 2023-12-15 15:06:32.000
406b42e6-63c6-402e-9519-87240d007315 b completed 2023-12-17 15:10:42.000 0 2023-12-17 15:10:42.000 2023-12-15 15:06:32.000
946e4c25-db5a-44e4-8c29-d7144ff5b03c c failed 2023-12-15 15:06:10.000 0 null 2023-12-15 15:06:10.000
ba0c96d2-4b68-43e5-99a6-0f30530dd1ad c failed 2023-12-16 15:06:32.000 0 null 2023-12-15 15:06:10.000

Fiddle