Get previous X days of available data, skipping days without data

33 Views Asked by At

I have a table that has daily data and thousands of rows per day:

create table my_table(visitors_count int, dt date);

select setseed(.42);

insert into my_table 
select (random()*100)::int, 
       current_date+(-random()*14)::int
from generate_series(1,42000);

However, some days are not present due to holidays:

delete from my_table where dt = '2024-03-08';

I need a query that would check if the initial date has data, and if it does, retrieves the data for the previous 5 days.

Example: If I put in today's date 2024/3/12, I would need to

  1. Check if 2024/3/12 has data.

  2. If yes, retrieve data for March 7,8,9,10,11 and 12:

    2024-03-07
    2024-03-08
    2024-03-09
    2024-03-10
    2024-03-11
    2024-03-12
  3. However, if let's say the 8th is a holiday and there was no data, I would need data for 6,7,9,10,11,12:

    2024-03-06
    2024-03-07
    2024-03-09
    2024-03-10
    2024-03-11
    2024-03-12
2

There are 2 best solutions below

0
Zegarek On

Set up a window to order by date descending and use dense_rank() to assign the same numbers to the same dates - it will skip the empty ones. Then ask for only those up to 5 days back: demo

with data_by_days_back as (
  select *,dense_rank()over w1 as days_back 
  from my_table 
  where dt<='2024-03-12' --start date
  window w1 as (order by dt desc) )
select distinct dt, days_back --checking which dates got caught
from data_by_days_back
where days_back<=6 --number of non-empty days back from the start 
order by dt;
dt days_back
2024-03-06 6
2024-03-07 5
2024-03-09 4
2024-03-10 3
2024-03-11 2
2024-03-12 1
0
Soren On

Using a CTE, you can get the last 5 dates preceding your date of interest using limit 6 and ordering by the date. And only returning results when the date of interest exists in the table. Then select all data points from your table that match those 6 dates (if any): demo

with last5dates as (
    select distinct mydate
    from my_table mdt
    where mdt.mydate <= '2024-03-12'
    order by mdt.mydate desc
    limit 6 )    
select distinct mydate
from my_table 
inner join last5dates using (mydate)
order by mydate;
mydate
2024-03-06
2024-03-07
2024-03-09
2024-03-10
2024-03-11
2024-03-12