How to get first value conditionally in Snowflake?

Question

How to get first value conditionally in Snowflake?

44 Views Asked by kimi At 04 March 2024 at 16:37

I have events table in Snowflake, one row per event. status can be failed that implies error or completed that implies error resolution. For each customer_id, I want to find two timestamps resolved_at and first_failed_at and ultimately calculate the error resolution in days between these two timestamps. How to achieve this?

I find the resolved_at like this but couldn't find first_failed_at. Note that there can be multiple events failed consecutively.

with windowed as (
    select 
        *,
        lag(status) over (partition by customer_id order by created_at) as prev_status,
        lead(status) over (partition by customer_id order by created_at) as next_status,
        lead(created_at) over (partition by customer_id order by created_at) as next_created_at
    from events
)

select 
    event_id,
    customer_id,
    status,
    created_at,
    case 
        when status = 'failed' and next_status = 'completed' then next_created_at
        else null
    end as resolved_at
from windowed

Here is the example data and two additional desired columns:

+--------------------------------------+-------------+-----------+---------------------+---------------------+---------------------+
|               event_id               | customer_id |  status   |     created_at      |     resolved_at     |   first_failed_at   |
+--------------------------------------+-------------+-----------+---------------------+---------------------+---------------------+
| e7921c9c-aa8f-49bb-83fa-31fa37d629ee | a           | completed | 2024-01-16 16:23:31 |                     |                     |
| 1d6a5bdf-e254-441d-9574-821844dc6c05 | a           | failed    | 2024-01-17 19:17:33 |                     | 2024-01-17 19:17:33 |
| 1a06783a-2ca4-4f68-80f0-7e7c84ac57fd | a           | failed    | 2024-01-22 18:07:54 |                     | 2024-01-17 19:17:33 |
| 61ecc8da-1e7f-4b19-92f1-fda93661a3af | a           | failed    | 2024-01-23 17:54:12 | 2024-01-24 19:34:48 | 2024-01-17 19:17:33 |
| 068e4220-bd57-410a-85f3-1fbb18f86053 | a           | completed | 2024-01-24 19:34:48 |                     |                     |
| 4ad30d16-ff13-48f3-b315-8a588d869c08 | a           | completed | 2024-01-25 17:30:54 |                     |                     |
| 82f8c57a-0e80-4ea6-96ad-3b578e5ec68a | a           | failed    | 2024-01-27 17:45:11 |                     | 2024-01-27 17:45:11 |
| 17d736d6-40de-4440-a220-e4a0612acb71 | a           | failed    | 2024-01-28 17:51:15 | 2024-01-29 17:23:31 | 2024-01-27 17:45:11 |
| be657472-69d0-4cd5-82e0-b33434ff213a | a           | completed | 2024-01-29 17:23:31 |                     |                     |
| 3bce3901-2639-489d-8280-04518a253b33 | b           | failed    | 2023-12-15 15:06:32 |                     | 2023-12-15 15:06:32 |
| 99b1bd8b-0ac3-4868-b687-73fc93ef9ddf | b           | failed    | 2023-12-16 15:07:21 | 2023-12-17 15:10:42 | 2023-12-15 15:06:32 |
| 406b42e6-63c6-402e-9519-87240d007315 | b           | completed | 2023-12-17 15:10:42 |                     |                     |
| 946e4c25-db5a-44e4-8c29-d7144ff5b03c | c           | failed    | 2023-12-15 15:06:10 |                     | 2023-12-15 15:06:10 |
| ba0c96d2-4b68-43e5-99a6-0f30530dd1ad | c           | failed    | 2023-12-16 15:06:32 |                     | 2023-12-15 15:06:10 |
+--------------------------------------+-------------+-----------+---------------------+---------------------+---------------------+

Original Q&A

There are 1 best solutions below

**ValNik** · Answer 1 · 2024-03-04T20:25:28.960000

See example for SQL Server. I think, this is applicable for snowflake.

Set number for each group or rows, where 0,1 or more 'failed' rows and 0 or 1 'completed' row. Group number is number of 'completed' row for customer_id, ordered by created_ad.
Calculate datetime for 'completed' row for group (null if no sach row), and calculate first 'failed' row.

with windowed as (
 select *,
  lag(status) over (partition by customer_id order by created_at) as prev_status,
  lead(status) over (partition by customer_id order by created_at) as next_status,
  lead(created_at) over (partition by customer_id order by created_at) as next_created_at,
  coalesce(sum(case when status='completed' then 1 else 0 end) 
     over (partition by customer_id order by created_at 
           rows between unbounded preceding and 1 preceding),0) as groupNum
 from events
)

select 
    event_id,
    customer_id,
    status,
    created_at,
    groupNum,
    max(case when status='completed' then created_at end)
        over(partition by customer_id,groupNum) resolved_at,
    min(case when status='failed' then created_at end)
        over(partition by customer_id,groupNum) first_failed_at
from windowed

Query output

event_id	customer_id	status	created_at	groupNum	resolved_at	first_failed_at
e7921c9c-aa8f-49bb-83fa-31fa37d629ee	a	completed	2024-01-16 16:23:31.000	0	2024-01-16 16:23:31.000	null
1d6a5bdf-e254-441d-9574-821844dc6c05	a	failed	2024-01-17 19:17:33.000	1	2024-01-24 19:34:48.000	2024-01-17 19:17:33.000
1a06783a-2ca4-4f68-80f0-7e7c84ac57fd	a	failed	2024-01-22 18:07:54.000	1	2024-01-24 19:34:48.000	2024-01-17 19:17:33.000
61ecc8da-1e7f-4b19-92f1-fda93661a3af	a	failed	2024-01-23 17:54:12.000	1	2024-01-24 19:34:48.000	2024-01-17 19:17:33.000
068e4220-bd57-410a-85f3-1fbb18f86053	a	completed	2024-01-24 19:34:48.000	1	2024-01-24 19:34:48.000	2024-01-17 19:17:33.000
4ad30d16-ff13-48f3-b315-8a588d869c08	a	completed	2024-01-25 17:30:54.000	2	2024-01-25 17:30:54.000	null
82f8c57a-0e80-4ea6-96ad-3b578e5ec68a	a	failed	2024-01-27 17:45:11.000	3	2024-01-29 17:23:31.000	2024-01-27 17:45:11.000
17d736d6-40de-4440-a220-e4a0612acb71	a	failed	2024-01-28 17:51:15.000	3	2024-01-29 17:23:31.000	2024-01-27 17:45:11.000
be657472-69d0-4cd5-82e0-b33434ff213a	a	completed	2024-01-29 17:23:31.000	3	2024-01-29 17:23:31.000	2024-01-27 17:45:11.000
3bce3901-2639-489d-8280-04518a253b33	b	failed	2023-12-15 15:06:32.000	0	2023-12-17 15:10:42.000	2023-12-15 15:06:32.000
99b1bd8b-0ac3-4868-b687-73fc93ef9ddf	b	failed	2023-12-16 15:07:21.000	0	2023-12-17 15:10:42.000	2023-12-15 15:06:32.000
406b42e6-63c6-402e-9519-87240d007315	b	completed	2023-12-17 15:10:42.000	0	2023-12-17 15:10:42.000	2023-12-15 15:06:32.000
946e4c25-db5a-44e4-8c29-d7144ff5b03c	c	failed	2023-12-15 15:06:10.000	0	null	2023-12-15 15:06:10.000
ba0c96d2-4b68-43e5-99a6-0f30530dd1ad	c	failed	2023-12-16 15:06:32.000	0	null	2023-12-15 15:06:10.000

Fiddle

How to get first value conditionally in Snowflake?

There are 1 best solutions below

Related Questions in SQL

Related Questions in SNOWFLAKE-CLOUD-DATA-PLATFORM

Related Questions in WINDOW-FUNCTIONS

Trending Questions

Popular # Hahtags

Popular Questions