Calculating Rolling Average on SQL

298 Views Asked by At

I need to write a Query that gives me January 31's rolling 3 day average of total transaction amount processed per day

I used the query below, but I'm having problem to retrieve information, because it keeps giving this error:

Query Error: error: column "sum_pday.sumamount" must appear in the GROUP BY clause or be used in an aggregate function

select 
    Sum_pDay.day
    , Sum_pDay.Month
    , Sum_pDay.Year
     ,  avg (Sum_pDay.SumAmount) over (order by Sum_pDay.day rows between 2 preceding and current row) as Avg
from (
  select 
    extract (year from transaction_time) as Year
      , extract (month from transaction_time) as Month
      , extract (day from transaction_time) as day
      , sum (transaction_amount) as SumAmount
     from
        transactions
     group by Year, Month, day
     order by 3) as Sum_pDay
group by 1,2,3
3

There are 3 best solutions below

1
nbk On

The error message is clear when you use a GROUP BY every column must be in the group by or have a aggregation function.

In your case you can SUM the amount as it is already in the group by

better is variant 2 as you really don't need the aggregation

CREATE TABLe transactions (transaction_time date,transaction_amount decimal(10,2))
CREATE TABLE
select 
    Sum_pDay.day
    , Sum_pDay.Month
    , Sum_pDay.Year
     ,  avg (SUM(Sum_pDay.SumAmount)) over (order by Sum_pDay.day rows between 2 preceding and current row) as Avg
from (
  select 
    extract (year from transaction_time) as Year
      , extract (month from transaction_time) as Month
      , extract (day from transaction_time) as day
      , sum (transaction_amount) as SumAmount
     from
        transactions
     group by Year, Month, day
     order by 3) as Sum_pDay
group by 1,2,3

day month year avg
SELECT 0
select 
    Sum_pDay.day
    , Sum_pDay.Month
    , Sum_pDay.Year
     ,  avg (Sum_pDay.SumAmount) over (order by Sum_pDay.day rows between 2 preceding and current row) as Avg
from (
  select 
    extract (year from transaction_time) as Year
      , extract (month from transaction_time) as Month
      , extract (day from transaction_time) as day
      , sum (transaction_amount) as SumAmount
     from
        transactions
     group by Year, Month, day
     order by 3) as Sum_pDay

day month year avg
SELECT 0

fiddle

0
Muhammad Sarmad On

The reason for your problem is that you're attempting to utilise a column Sum_pDay.SumAmount in the window function's AVG that is neither used in an aggregate function nor included in the GROUP BY clause. You can alter your query as follows to determine the rolling 3-day average of the daily transaction amount processed:

SELECT
    day,
    month,
    year,
    AVG(SumAmount) OVER (
        ORDER BY year, month, day
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS avg_transaction_amount
FROM (
    SELECT
        EXTRACT(YEAR FROM transaction_time) AS year,
        EXTRACT(MONTH FROM transaction_time) AS month,
        EXTRACT(DAY FROM transaction_time) AS day,
        SUM(transaction_amount) AS SumAmount
    FROM transactions
    GROUP BY year, month, day
) AS Sum_pDay
ORDER BY year, month, day;

Since you already completed the aggregation in the subquery Sum_pDay in this query, I've eliminated the GROUP BY from the outer query. In this manner, you may perform a straight calculation of the rolling 3-day average over the sorted result set of Sum_pDay. Hope this fixes the issue you were having.

1
Erwin Brandstetter On

Only aggregate once to get the daily sum of transactions, and the average daily sum over the past 3 days. You can use a subquery for that, but you don't have to:

SELECT transaction_time::date AS transaction_date
     , sum(transaction_amount) AS sum_day
     , avg(sum(transaction_amount)) OVER (ORDER BY transaction_time::date
                                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS avg_3days
FROM   transactions
GROUP  BY 1
ORDER  BY 1;

This aggregates a "rolling average" for the whole table. If you only need the result for "January 31" (of what year?), filter early to make the query much cheaper.

Aggregate functions can be nested in window functions. Consider the sequence of events in a SELECT query: