I am trying to create a column that represents how many days ago the most recent promotion started for each product. It should continue counting on when the promotion ended, until the next promotion starts.
For example, I want the following:
| product | date | in_promo | days_since_last_promo |
|---|---|---|---|
| 1 | 2023-10-01 | false | null |
| 1 | 2023-10-02 | false | null |
| 1 | 2023-10-03 | true | 0 |
| 1 | 2023-10-04 | true | 1 |
| 1 | 2023-10-05 | true | 2 |
| 1 | 2023-10-06 | false | 3 |
| 1 | 2023-10-07 | false | 4 |
| 1 | 2023-10-08 | true | 0 |
| 1 | 2023-10-09 | true | 1 |
| 1 | 2023-10-10 | false | 2 |
Particularly, I'm having trouble getting the right days_since_last_promo for these lines:
| product | date | in_promo | days_since_last_promo |
|---|---|---|---|
| 1 | 2023-10-06 | false | 3 |
| 1 | 2023-10-07 | false | 4 |
I've been puzzling with lags, row_number() and partition by's but I can't figure it out. Is this even possible in SQL?
I would say it's related to this post, but we're trying to achieve something slightly different.
I tried for example
select
product
, date
, in_promo
, row_number() over (partition by recipe_id, in_promo, seqnum_u - seqnum_uo
order by date_cet
) as days_since_last_promo
from (select p.*,
row_number() over (partition by product order by date) as seqnum_u,
row_number() over (partition by product, in_promo order by date) as seqnum_uo
from product_sales_data as p
)
But that will give me
| product | date | in_promo | days_since_last_promo |
|---|---|---|---|
| 1 | 2023-10-01 | false | 1 |
| 1 | 2023-10-02 | false | 2 |
| 1 | 2023-10-03 | true | 1 |
| 1 | 2023-10-04 | true | 2 |
| 1 | 2023-10-05 | true | 3 |
| 1 | 2023-10-06 | false | 1 |
| 1 | 2023-10-07 | false | 2 |
| 1 | 2023-10-08 | true | 1 |
| 1 | 2023-10-09 | true | 2 |
| 1 | 2023-10-10 | false | 1 |
i.e. restarting the row_number when in_promo=false.
Here a solution using ORACLE syntax but with standard analytic functions, assuming the start of the promotion is the first date in_promo for a sequence of rows ( in_promo* !in_promo+ ) (should be easier with MATCH_RECOGNIZE but would be ORACLE-only):