How to do SQL recursive logic

170 Views Asked by At

I have this sample dataset like this ;

acctid payseq totalamount acctnum expectprice payorder
aac1 99 1000.00 1111aa 800.00 1
aac1 99 1000.00 1111bb 800.00 2
aac1 99 1000.00 1111cc 800.00 3
aac1 99 1000.00 1111dd 1200.00 4
aac1 100 2000.00 1111aa 800.00 1
aac1 100 2000.00 1111bb 800.00 2
aac1 100 2000.00 1111cc 800.00 3
aac1 100 2000.00 1111dd 1200.00 4

I want a Paid amount column where the total amount is distributed between acctnum for that acctid, payseq combination; the final result should look like this;

acctid payseq totalamount acctnum expectprice payorder paidamount
aac1 99 1000.00 1111aa 800.00 1 800.00
aac1 99 1000.00 1111bb 800.00 2 200.00
aac1 99 1000.00 1111cc 800.00 3
aac1 99 1000.00 1111dd 1200.00 4
aac1 100 2000.00 1111aa 800.00 1
aac1 100 2000.00 1111bb 800.00 2 600.00
aac1 100 2000.00 1111cc 800.00 3 800.00
aac1 100 2000.00 1111dd 1200.00 4 600.00

For the first payseq 99 - we distribute the total amount 1000 between the first 2 acctnumns (1111aa & 1111bb) based on the expectprice on each acctnum. For the second payseq 100 - we need to start the distribution from where the first one left, i.e second acctnum had 600 remaining so distribution of total amount of 2000 should be distributed among acctnums (1111bb, 1111cc & 1111dd). Like wise if there are more payseq's.

I tried using something like

totalamount - sum(ifnull(ExpectPrice,0.0)) 
over(partition by acctId, payseq order by payorder asc rows between unbounded preceding and 0 preceding) as PaidAmount

but this always starts distribution from first acctnum.

Can anyone help me build a SQL query using recursive CTE or otherwise to accomplish this logic?;

PS: I do not want to write it as stored procedure. I need this in SQL query only.

Have tried writing case statements etc but nothing seems to help!

1

There are 1 best solutions below

10
T N On

This was tricky. In the following discussion, I am going to use the terms "payout" and "payout amount" to refer to the {acctid, payseq, totalamount} data and "account" and "expected amount" to refer to the {acctnum, expectprice, payorder} data. "Allocated amount" is the calculated result.

For a single payout, the allocated amount from each account can be calculated by first reducing the payout amount by the maximum amount that might have been allocated from accounts with an earlier payorder. That adjustment is just the running sum up to the immediately prior row. That adjusted value is then clamped to the range [0, expectprice] of the current row. If it was negative, we had nothing left to allocate. If it exceeded expectprice, we will allocate the full expectprice amount and will still have some leftover portion to allocate later.

The problem with this is that this only works for a single payout. If we attempt this for multiple payouts, the numbers reset each time, and we keep allocating the same amounts from the same accounts, not adjusting for already allocated or depleted accounts.

The fix for this was tricky. We need to start out by adjusting the payout amount so that it includes all prior payout amounts. Effectively the current running sum of payout amounts. Now we apply the previously discussed allocation algorithm to the current adjusted payout. This will force the allocations to advance past the prior high water mark and allocate new money from remaining account funds.

This is good, but we still have a problem. The current allocations also reallocate all previously allocated account funds. How do we fix that? The answer turns out to be surprisingly simple (well maybe) - we just perform the same allocation calculations for the prior cumulative payout sum and subtract those results from the current calculations. This will leave just the current allocations with no overlap.

The resulting code is:

WITH CTE_Payout AS (
    SELECT *,
        SUM(totalamount) OVER(ORDER BY acctid, payseq) AS CummulativeTotalAmount,
        ISNULL(SUM(totalamount) OVER(
            ORDER BY acctid, payseq
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ), 0) AS PriorCummulativeTotalAmount
    FROM Payout P
),
CTE_Acct AS (
    SELECT *,
        ISNULL(SUM(expectprice) OVER(
            ORDER BY payorder
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ), 0) AS PriorCummulativeExpectedPrice
    FROM Account A
)
SELECT
    P.acctid, P.payseq, P.totalamount,
    A.acctnum, A.expectprice, A.payorder,
    LEAST(GREATEST(
            P.CummulativeTotalAmount - A.PriorCummulativeExpectedPrice,
        0), A.expectprice)
        - LEAST(GREATEST(
            P.PriorCummulativeTotalAmount - A.PriorCummulativeExpectedPrice,
        0), A.expectprice)
        AS paidamount,
    -- The following just show the working calculations
    P.CummulativeTotalAmount,
    LEAST(GREATEST(
            P.CummulativeTotalAmount - A.PriorCummulativeExpectedPrice,
        0), A.expectprice)
        AS CurrentCummulativeAllocation,
    P.PriorCummulativeTotalAmount,
    LEAST(GREATEST(
            P.PriorCummulativeTotalAmount - A.PriorCummulativeExpectedPrice,
        0), A.expectprice)
        AS PriorCummulativeAllocation
FROM CTE_Payout P
CROSS APPLY CTE_Acct A
ORDER BY acctid, payseq, payorder

Results (with intermediate calculations):

acctid payseq totalamount acctnum expectprice payorder paidamount Cummulative
Total Amount
Cummulative
Allocation
Prior
Cummulative
Total Amount
Prior
Cummulative
Allocation
aac1 99 1000.00 1111aa 800.00 1 800.00 1000.00 800.00 0.00 0.00
aac1 99 1000.00 1111bb 800.00 2 200.00 1000.00 200.00 0.00 0.00
aac1 99 1000.00 1111cc 800.00 3 0.00 1000.00 0.00 0.00 0.00
aac1 99 1000.00 1111dd 1200.00 4 0.00 1000.00 0.00 0.00 0.00
aac1 100 2000.00 1111aa 800.00 1 0.00 3000.00 800.00 1000.00 800.00
aac1 100 2000.00 1111bb 800.00 2 600.00 3000.00 800.00 1000.00 200.00
aac1 100 2000.00 1111cc 800.00 3 800.00 3000.00 800.00 1000.00 0.00
aac1 100 2000.00 1111dd 1200.00 4 600.00 3000.00 600.00 1000.00 0.00

Note that paidamount = CummulativeAllocation - PriorCummulativeAllocation.

It wouldn't take much to wrap the final select above into a subquery or another CTE and apply a filter WHERE paidamount <> 0 to eliminate unnecessary results.

See this db<>fiddle for a demo.