I'm trying to solve a problem in the following (simplified) dataset:
| Name | Date | Workday | Calenderday | Leave |
|---|---|---|---|---|
| PersonA | 2023-01-01 | 0 | 1 | NULL |
| PersonA | 2023-01-07 | 0 | 1 | NULL |
| PersonA | 2023-01-08 | 0 | 1 | NULL |
| PersonA | 2023-01-13 | 1 | 1 | Sick |
| PersonA | 2023-01-14 | 0 | 1 | NULL |
| PersonA | 2023-01-15 | 0 | 1 | NULL |
| PersonA | 2023-01-16 | 1 | 1 | Sick |
| PersonA | 2023-01-20 | 1 | 1 | Holiday |
| PersonA | 2023-01-21 | 0 | 1 | NULL |
| PersonA | 2023-01-22 | 0 | 1 | NULL |
| PersonA | 2023-01-23 | 1 | 1 | Holiday |
| PersonB | 2023-01-01 | 0 | 1 | NULL |
| PersonB | 2023-01-02 | 1 | 1 | Sick |
| PersonB | 2023-01-03 | 1 | 1 | Sick |
Where the lines with NULL in [Leave] is weekend.
What I want is a result looking like this:
| Name | Leave | PeriodStartDate | PeriodEndDate | Workdays | Weekdays |
|---|---|---|---|---|---|
| PersonA | Sick | 2023-01-13 | 2023-01-16 | 2 | 4 |
| PersonA | Holiday | 2023-01-20 | 2023-01-23 | 2 | 4 |
| PersonB | Sick | 2023-01-02 | 2023-01-03 | 2 | 2 |
where the difference between [Workdays] and [Weekdays] is that weekdays also counts the weekend.
What I have been trying is to first make a row (in two different ways)
ROW_NUMBER() OVER (PARTITION BY \[Name\] ORDER BY \[Date\]) as RowNo1
ROW_NUMBER() OVER (PARTITION BY \[Name\], \[Leave\] ORDER BY \[Date\]) as RowNo2
and after that to make a period base date:
DATEADD(DAY, 0 - \[RowNo1\], Date) as PeriodBaseDate1
,DATEADD(DAY, 0 - \[RowNo2\], \[Date\]) as PeriodBaseDate2
and after that do something like this:
MIN(\[Date\]) as PeriodStartDate
,MAX(\[Dato\]) as PeriodEndDate
,SUM(\[Calenderday\]) as Weekdays
,SUM(\[Workday\]) as Workdays
GROUP BY \[PeriodBaseDate (1 or 2?)\], \[Leave\], \[Name\]
But whatever I do I can't seem to get it to count the weekends in the periods.
It doesn't have to include my try with the RowNo, PeriodBaseDate etc.
As we don't have your actual full solutions, I've provided a full working one. I firstly use
LAST_VALUEto have all the rows have a value for theirLeave(provided there was a non-NULLvalue previously).Once I do that, you have a gaps and island problem, and can aggregate based on that.
I assume you are using SQL Server 2022, the latest version of SQL Server at the time of writing, as no version details are given and thus have access to the
IGNORE NULLSsyntax.