I have test data like the following:
DROP TABLE IF EXISTS #test;
CREATE TABLE #test (id INT, start_date DATE, end_date DATE)
INSERT INTO
#test
VALUES
(1, '2023-01-01', '2024-01-01'),
(1, '2023-05-01', '2024-07-01'),
(1, '2025-01-01', '2026-01-01');
SELECT *
FROM #test;
I am trying to create integer groups that check if the prior row connects with the current. If it doesn't connect, I want to increment the integer to notate that its a separate group.
I have tried below to check if the current row connects with the prior based on if the current period start_date comes before the prior periods end date. this would signal that the time periods connect and be treated as the same group.
WITH connects AS (
SELECT
id,
CASE
WHEN LAG(id, 1) OVER (PARTITION BY id ORDER BY start_date) = id AND (
start_date <= LAG(end_date, 1) OVER (PARTITION BY id ORDER BY start_date)
OR
end_date <= LAG(end_date, 1) OVER (PARTITION BY id ORDER BY start_date)
) THEN 1
WHEN LAG(id, 1) OVER (PARTITION BY id ORDER BY start_date) IS NULL THEN 1
ELSE 0
END AS connects_flag
FROM
#test
)
SELECT
*
FROM
connects;
Desired outcome:
| id | start_date | end_date | grp |
|---|---|---|---|
| 1 | 2023-01-01 | 2024-01-01 | 1 |
| 1 | 2023-05-01 | 2024-07-01 | 1 |
| 1 | 2025-01-01 | 2026-01-01 | 2 |
Here's one way to do it to with cumulative aggregation:
In the common table expression I've used a combination of
lagandiifto create a column that will contain1whenever the previous row's end date is either missing or smaller than the current row's start date, and0otherwise (so 0 when the previous row overlaps with the current, 1 if it doesn't).Then all that's left to do is to commulatively sum the values of that column.
I've also added some more rows to your sample table:
Results:
You can see a live demo on db<>fiddle