Is there an elegant way to do union after aggregation with different selection criteria in SQL

91 Views Asked by At

For example, I have the following sale table:

week sales
1 10
1 15
2 5
2 10
3 10
3 20
4 5
4 10
5 15
5 10

I want to have a SQL query like this:

SELECT '1-3' AS period, SUM(sales)
FROM sale
WHERE 1<=week AND week<=3
GROUP BY period

UNION ALL

SELECT '2-4' AS period, SUM(sales)
FROM sale
WHERE 2<=week AND week<=4
GROUP BY period

UNION ALL

SELECT '3-5' AS period, SUM(sales)
FROM sale
WHERE 3<=week AND week<=5
GROUP BY period

When the number of rolling windows is large, the above SQL query will be cumbersome. Is there a workaround?

I tried to think of using carterian product. However, it will generate some unneeded rows.

3

There are 3 best solutions below

0
Salman A On

A simple and efficient solution is to use window functions with rolling window frame clause. However, if there are missing weeks in the data (e.g. week 1, 3, 4) then frame will be calculated incorrectly or some expected data will be missing.

A complete solution is to add the missing weeks using some row generator technique (e.g. recursive cte) then use window functions to calculate sum over rolling window:

WITH rcte(week, maxval) AS (
    SELECT MIN(week), MAX(week) - 2
    FROM sale
    UNION ALL
    SELECT week + 1, maxval
    FROM rcte
    WHERE week + 1 <= maxval
), cte AS (
    SELECT week, sales
    FROM sale
    UNION ALL
    SELECT week, 0
    FROM rcte
)
SELECT
    CONCAT(week, '-', week + 2) AS period,
    SUM(SUM(sales)) OVER (ORDER BY week ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS total
FROM cte
GROUP BY week
ORDER BY week

This approach should work in all RDBMS that support recursive CTEs and window functions. Some RDBMs have built-in row generator functions.

DB<>Fiddle

0
Søren Kongstad On

If you have analytical window functions available its quite easy,

Assuming salestable contains the data

SELECT
    [week]
   ,LAG([week], 2) OVER (ORDER BY [week]) pweek
   ,SUM(SUM(sales)) OVER (ORDER BY [week] ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) sales

FROM salestable
GROUP BY [week]

this will give you sums of three running months, with pweek containing the startweek and sales the value

If there are less than 2 preceding weeks, pweek will be null.

+------+-------+-------+
| week | pweek | sales |
+------+-------+-------+
|    1 | null  |    25 |
|    2 | null  |    40 |
|    3 | 1     |    70 |
|    4 | 2     |    60 |
|    5 | 3     |    70 |
+------+-------+-------+

So I wrap this in a new query, that filters out nulls in pweek, and formats the range:

SELECT
    CAST(pweek AS VARCHAR(10)) + '-' + CAST([week] AS VARCHAR(10)) period
   ,sales
FROM (SELECT
        [week]
       ,LAG([week], 2) OVER (ORDER BY [week]) pweek
       ,SUM(SUM(sales)) OVER (ORDER BY [week] ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) sales

    FROM salestable
    GROUP BY [week]) a
WHERE a.pweek IS NOT NULL
AND [week] - a.pweek = 2


+-------+-------+
| period| sales |
+-------+-------+
| 1-3   |    70 |
| 2-4   |    60 |
| 3-5   |    70 |
+-------+-------+

the final condition "week-pweek=2" is to handle missing weeks.

If you do not have data for eg week 4, then this query would give us the range 2-5, now it will just ignore it.

To fix this I would fill out the weeks, ensuring that if I have no data in a week, there will be a zero sales. But this is beyond the scope of this.

SQL Fiddle is here

0
d r On

One option is to use subquery defining periods (using LEVEL Connect By) and get the sums of periods:

WITH    --  Sample Data
    sales (WEEK, SALE) AS
        (   Select  1,  10  From Dual   Union All
            Select  1,  15  From Dual   Union All
            Select  2,  5   From Dual   Union All
            Select  2,  10  From Dual   Union All
            Select  3,  10  From Dual   Union All
            Select  3,  20  From Dual   Union All
            Select  4,  5   From Dual   Union All
            Select  4,  10  From Dual   Union All
            Select  5,  15  From Dual   Union All
            Select  5,  10  From Dual 
        )
Select  s.PERIOD, 
        Nvl((Select Sum(SALE) From sales Where WEEK Between s.FROM_WEEK And s.TILL_WEEK), 0) "PERIOD_SALE"
From    ( Select    LEVEL "RN", LEVEL "FROM_WEEK", LEVEL + 2 "TILL_WEEK", To_Char(LEVEL) || '-' || To_Char(LEVEL + 2) "PERIOD"
          From  dual
          Connect By LEVEL <= (Select Max(Week) - 2 From sales)
          Group By LEVEL
          Order By LEVEL
        ) s
--  R e s u l t :
--  PERIOD   PERIOD_SALE
--  -------- -----------
--  1-3               70
--  2-4               60
--  3-5               70