SQL query for the total number of employees for the selected period in the SCD2 table

62 Views Asked by At

I would need a help for the sql syntax which would give me the total number of employees for the selected period in the SCD2 table. Examples are provided below - the thing is that i get the inconsistent results for the totals, i guess due to the fact that i'm using "FlagIsCurrent" variable. Since the variable "ValidFrom" and "ValidTo" refer to the period when somebody worked in the specific store or quit, would it be possible to get the consistent results with those variables? "FlagIsCurrent" only refers to the current state of affairs and i want to go "into past".

Any ideas?

Table looks similar like this:

EmployeeID StoreID EmploymentDate TerminationDate ValidFrom ValidTo FlagIsCurrent LoadDateTime
e01 5 2010-10-01 1900-01-01 2010-10-01 9999-12-31 1 2023-01-05 23:00:27.543
e02 4 2022-10-10 1900-01-01 2022-10-10 2022-12-10 0 2023-01-01 23:00:27.543
e02 5 2022-10-10 1900-01-01 2022-12-13 9999-12-31 1 2023-01-05 23:00:27.543
e03 3 2023-02-20 1900-01-01 2023-02-20 9999-12-31 1 2023-01-05 23:00:27.543
e04 3 2022-08-25 1900-01-01 2022-08-25 2022-12-04 0 2023-01-01 23:00:27.543
e04 3 2022-08-25 2022-12-05 2022-12-05 9999-12-31 1 2023-01-05 23:00:27.543

Hires for the selected period (works ok):


SELECT COUNT (DISTINCT [EmployeeID])
FROM [import].[hr].[Employees]
WHERE [EmploymentDate]>=MIN(Calendar[Date]) AND [EmploymentDate]<=MAX(Calendar[Date])

...or for the specific month:

SELECT COUNT (DISTINCT [EmployeeID])
FROM [import].[hr].[Employees]
WHERE [EmploymentDate]>='2023-02-01' AND [EmploymentDate]<='2023-02-28'

Terminations for the selected period (works ok):

SELECT COUNT (DISTINCT [EmployeeID])
FROM [import].[hr].[Employees]
WHERE [TerminationDate]<>'1900-01-01' AND [TerminationDate]>=MIN(Calendar[Date]) AND [TerminationDate]<=MAX(Calendar[Date])

...or for the specific month:

SELECT COUNT (DISTINCT [EmployeeID])
FROM [import].[hr].[Employees]
WHERE [TerminationDate]<>'1900-01-01' AND [TerminationDate]>='2023-02-01' AND [TerminationDate]<='2023-02-28'

Totals for the selected period (inconsistent results):

SELECT COUNT (DISTINCT [EmployeeID])
FROM [import].[hr].[Employees]
WHERE [FlagIsCurrent]='1'
AND
(
([TerminationDate]='1900-01-01' AND [EmploymentDate]<=MAX(Calendar[Date])
OR
([EmploymentDate]<=MAX(Calendar[Date]) AND [TerminationDate]>=MIN(Calendar[Date])
))

...or for the specific month:

SELECT COUNT (DISTINCT [EmployeeID])
FROM [import].[hr].[Employees]
WHERE [FlagIsCurrent]='1'
AND
(
([TerminationDate]='1900-01-01' AND [EmploymentDate]<='2023-02-28'
OR
([EmploymentDate]<='2023-02-28') AND [TerminationDate]>='2023-02-01'
))

UPDATE

Is it safe to assume that the total number of employees for the selected period would be the following:

SELECT COUNT ([EmployeeID]) FROM [import].[hr].[Employees] WHERE [TerminationDate] IS '1900-01-01' AND [ValidFrom]<=MAX(Calendar[Date]) AND [ValidTo]>=MIN(Calendar[Date])

...or for the last February that would be:

SELECT COUNT ([EmployeeID]) FROM [import].[hr].[Employees] WHERE [TerminationDate] IS '1900-01-01' AND [ValidFrom]<='2023-02-28' AND [ValidTo]>='2023-02-01'
0

There are 0 best solutions below