Teradata query to Calculate Start and End dates Based on a condition

38 Views Asked by At

I have a dataset for certain STS codes and I need to retain the Start and End Date When Status =03

I have the following Data

HAVE

  ID               CHGDATE          STS
  101              11/21/2022        00
  101              11/22/2022        03
  101              11/23/2022        03
  101              11/24/2022        03
  101              03/08/2023        04
  102              08/05/2022        03
  102              08/08/2022        04
  102              11/10/2022        03
  102              11/17/2022        00

WANT
   ID            STARTDATE           ENDDATE
  101           11/22/2022           03/08/2023
  102           08/05/2022           08/08/2022
  102           11/10/2022           11/17/2022

 
  • The ID 101 had STS=03 for the fist time on 11/22/2022, hence this should be my 'StartDate for this Record. This record changed its STS to other than 03 on 3/8/2023, therefore this should be EndDate

  • The ID 102 had STS=03 multiple times with two Start and End dates 8/5-8/8 and 11/10-11/17

     My query
    
      SELECT 
       ID,STS,CHGDATE AS STARTDATE,
       LEAD(CHGDATE) OVER (PARTITION BY ID ORDER BY CHGDATE) AS ENDDATE
    FROM 
      HAVE
      WHERE STS=03
    

This query works for ID 102 but not for ID 101. Is there a different way to achieve the desired result?

Thanks

2

There are 2 best solutions below

1
Andrew On

It won't work the way you're trying to do it. Your query is excluding all rows where sts <> '03'. I would split it into 2 derived tables, one for your start dates (sts = 03), and one for everything else (sts <> '03'). Then you can join those together, and use qualify to get only the relevant rows:

create volatile table vt_foo
(
id int,
changedate date,
sts varchar(2))
on commit preserve rows;
insert into vt_foo values (102, '2022-08-05','03');
insert into vt_foo values (102, '2022-08-08','04');
insert into vt_foo values (102, '2022-11-10','03');
insert into vt_foo values (102, '2022-11-17','00');

select
strt.id,
strt.changedate as startdate,
oth.changedate as enddate
from 
(
select
*
from
vt_foo
where
sts = '03')  strt
left join
(
select
*
from
vt_foo
where
sts <>  '03') oth
    on strt.id = oth.id
    and oth.changedate > strt.changedate
qualify row_number() over (partition by strt.id,strt.changedate order by oth.changedate) = 1
2
ValNik On

See example

select ID,min(chgdate)startdate,max(nextdate)enddate,min(STS) STS,group_num
from(
  select *
    ,sum(new_group)
        OVER (PARTITION BY ID ORDER BY CHGDATE
             ROWS_BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- @Fred
       ) group_num
  from(
    SELECT 
     ID,STS,CHGDATE,
     case when STS<>Lag(STS) OVER (PARTITION BY ID ORDER BY CHGDATE) then 1
     else 0 end new_group,
     LEAD(chgdate) OVER (PARTITION BY ID ORDER BY CHGDATE) nextdate
    FROM HAVE  
  )t
)t2
where sts='03'
group by id,group_num
order by id,group_num