Calculate average duration over time span series

I have a table that contains call detail records with the following structure (simplified):

CREATE TABLE public.cdr (
    call_id int8,
    start_date timestamp,
    duration_ms int4
)

I need to calculate the average call duration per time range.

The result table should look something like this:

range_start            |avg_duration_ms | number_of_calls_in_range
-----------------------+----------------+-------------------------
2023-05-15 15:00:00.000|    65230       |  12
2023-05-15 15:05:00.000|    3450        |  67
2023-05-15 15:10:00.000|    28329       |  25

I can't comprehend how to get an average over the part of the call that falls within a certain range, rather than the total call duration.

Answer by Frank Heikens

This is much easier to do with range data types, like tsrange. You need a start and an end timestamp for each call, and then you can calculate the time spent in each frame.

Something like this:

CREATE TABLE public.cdr (
    call_id int8,
    call_range tsrange
);

INSERT INTO public.cdr(call_id, call_range)
VALUES (1, tsrange('2024-01-30 11:59','2024-01-30 12:03','[)'))
        ,(2,tsrange('2024-01-30 12:01','2024-01-30 12:04','[)'))
        ,(3,tsrange('2024-01-30 12:01','2024-01-30 12:06','[)'))
        ,(4,tsrange('2024-01-30 12:05','2024-01-30 12:07','[)'))
;
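
If the data stays in the question's original layout (start_date plus duration_ms), the range can be derived on the fly instead of being stored. The following is only a sketch under that assumption: the view name cdr_ranges is hypothetical, and it expects the original column definitions, not the redefined table above.

```sql
-- Assumes the question's original table: cdr(call_id, start_date, duration_ms).
-- cdr_ranges is a hypothetical name used only for this sketch.
CREATE VIEW public.cdr_ranges AS
SELECT call_id
     , tsrange(start_date
             , start_date + duration_ms * interval '1 millisecond'
             , '[)') AS call_range   -- start inclusive, end exclusive
FROM public.cdr;
```

Two edge cases to keep in mind: a call with duration_ms = 0 produces an empty range, which never overlaps any frame, and a negative duration makes tsrange() raise an error because the lower bound would exceed the upper bound.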

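SELECT ts AS frame
    -- Time spent inside the frame: the earlier of the two ends minus the
    -- later of the two starts. least() and greatest() ignore NULLs, so the
    -- CASE keeps frames without any call at NULL instead of 5 minutes.
    , AVG(least(ts + interval '5 minutes', upper(call_range))
        - CASE
            WHEN lower(call_range) IS NOT NULL
                THEN greatest(ts, lower(call_range))
            ELSE NULL
            END)
    , SUM(least(ts + interval '5 minutes', upper(call_range))
        - CASE
            WHEN lower(call_range) IS NOT NULL
                THEN greatest(ts, lower(call_range))
            ELSE NULL
            END)
    ,   count(call_id)
FROM    public.cdr
    -- RIGHT JOIN keeps every 5-minute frame, including those without calls
    RIGHT JOIN generate_series('2024-01-30 11:45'::timestamp
                   , '2024-01-30 12:15'::timestamp
                   , interval '5 minutes'
               ) g(ts)
        ON call_range && tsrange(ts,ts + interval '5 minutes','[)') -- overlap
GROUP BY ts
ORDER BY ts;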

You can convert the interval results to seconds or milliseconds, whichever fits best.
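
As a sketch of that conversion, the variant below reports the average in milliseconds using the column names from the question. It is an alternative formulation, not the query above: it uses the range intersection operator (*) to get the part of each call inside the frame, and extract(epoch FROM ...) to turn the interval into seconds. The aliases per_frame and overlap_part are made up for this example.

```sql
SELECT range_start
     , round(avg(extract(epoch FROM upper(overlap_part) - lower(overlap_part))
                 * 1000)) AS avg_duration_ms
     , count(call_id)     AS number_of_calls_in_range
FROM (
    SELECT g.ts AS range_start
         , c.call_id
           -- Part of the call inside this frame; NULL when no call matched,
           -- so empty frames get a NULL average rather than a fake value.
         , c.call_range * tsrange(g.ts, g.ts + interval '5 minutes', '[)')
               AS overlap_part
    FROM generate_series('2024-01-30 11:45'::timestamp
                       , '2024-01-30 12:15'::timestamp
                       , interval '5 minutes') AS g(ts)
    LEFT JOIN public.cdr AS c
           ON c.call_range && tsrange(g.ts, g.ts + interval '5 minutes', '[)')
) AS per_frame
GROUP BY range_start
ORDER BY range_start;
```

With the sample rows inserted earlier, the three non-empty frames should come out as 60000, 200000 and 90000 ms, matching the 1 min, 3 min 20 s and 1 min 30 s averages in the interval-based result.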

Result:

frame                      | avg                                            | sum                                            | count
---------------------------+------------------------------------------------+------------------------------------------------+------
2024-01-30 11:45:00.000000 |                                                |                                                |     0
2024-01-30 11:50:00.000000 |                                                |                                                |     0
2024-01-30 11:55:00.000000 | 0 years 0 mons 0 days 0 hours 1 mins 0.0 secs  | 0 years 0 mons 0 days 0 hours 1 mins 0.0 secs  |     1
2024-01-30 12:00:00.000000 | 0 years 0 mons 0 days 0 hours 3 mins 20.0 secs | 0 years 0 mons 0 days 0 hours 10 mins 0.0 secs |     3
2024-01-30 12:05:00.000000 | 0 years 0 mons 0 days 0 hours 1 mins 30.0 secs | 0 years 0 mons 0 days 0 hours 3 mins 0.0 secs  |     2
2024-01-30 12:10:00.000000 |                                                |                                                |     0
2024-01-30 12:15:00.000000 |                                                |                                                |     0