Calculate subtotal with special condition

66 Views Asked by At
id flow amount date
1 IN 100 2023-08-01
2 OUT 100 2023-08-02
3 IN 100 2023-08-03
4 OUT 100 2023-08-04
5 OUT 100 2023-08-05

In a table like above, I want to sum all OUT transactions which take place after every IN transaction. The desired outcome should look like below:

id flow amount date out_amount
1 IN 100 2023-08-01 100
3 IN 100 2023-08-03 200

How could this be achieved?

3

There are 3 best solutions below

0
Guru Stron On BEST ANSWER

If you are using Presto then you can follow gaps-and-islands approach (based on provided data I would use id as ordering field, but for actual you might need/want to use date):

-- sample data
with dataset(id, flow, amount, date) as (
    values (1,  'IN',   100,    '2023-08-01'),
    (2, 'OUT',  100,    '2023-08-02'),
    (3, 'IN',   100,    '2023-08-03'),
    (4, 'OUT',  100,    '2023-08-04'),
    (5, 'OUT',  100,    '2023-08-05'),
    (6, 'IN',   100,    '2023-08-11')
)

-- query
select min(id) id,
    min_by(flow, id) flow,
    min_by(amount, id) amount,
    min_by(date, id) date,
    sum(if(flow = 'OUT', amount)) out
from (select *,
             count_if(flow = 'IN') over (order by id) as grp
      from dataset)
group by grp
order by id;

Output:

id flow amount date out
1 IN 100 2023-08-01 100
3 IN 100 2023-08-03 200
6 IN 100 2023-08-11 NULL

For Trino you can also use MATCH_RECOGNIZE:

-- query
SELECT *
FROM dataset MATCH_RECOGNIZE(
     ORDER BY id
     MEASURES
         A.id as id,
         A.flow as flow,
         A.amount as amount,
         A.date as date,
         SUM(B.amount) AS out
     ONE ROW PER MATCH
     AFTER MATCH SKIP PAST LAST ROW
     PATTERN (A B*)
     DEFINE
            A AS flow = 'IN',
            B AS flow = 'OUT'
     );
0
shawnt00 On

Standard gaps and islands. Tag with the number of IN rows and then use grouping.

with data as (
    select *,
        count(case when flow = 'IN' then 1 end) over (order by date) as grp
    from T
)
select
    min(id) as id,
    'IN' as flow,
    min(case when flow = 'IN' then amount end) as amount,
    min(date) as date,
    sum(case when flow = 'OUT' then amount end) as out_amount
from data
group by grp
0
Ajax1234 On

Using a subquery with a cte:

with cte as (
   select t.*, (select sum(t1.flow != t.flow and t1.id < t.id) 
   from transactions t1) r from transactions t
)
select t2.id, t2.flow, t2.amount, t2.date, t1.amount out_amount 
from (select t.r, sum(t.amount) amount from cte t where t.flow = 'OUT' group by t.r) t1
join cte t2 on t2.flow = 'IN' and t2.r = t1.r - 1