How to turn an operation and timestamp in rows into separate columns using PL SQL

53 Views Asked by At

I tried using sql pivot, but it's not working, getting null value.

Before data:

NAME    OPERATION   TIMESTAMP
A       START      18/01/2024 8:24
A       START      18/01/2024 8:27
A       START      18/01/2024 8:30
A       END        18/01/2024 8:34
B       START      18/01/2024 9:28
B       END        18/01/2024 9:32
B       START      18/01/2024 9:37
B       END        18/01/2024 9:39
C       START      18/01/2024 10:14
C       END        18/01/2024 10:18

After data (What I need):

NAME    START_EVENT START_TIME       END_EVENT  END_TIME         DURATION
A       START       18/01/2024 8:24  END        18/01/2024 8:34  0.006944444
B       START       18/01/2024 9:28  END        18/01/2024 9:32  0.002777778
B       START       18/01/2024 9:37  END        18/01/2024 9:39  0.001388889
C       START       18/01/2024 10:14 END        18/01/2024 10:18 0.002777778
2

There are 2 best solutions below

0
MT0 On

From Oracle 12, you can use MATCH_RECOGNIZE for row-by-row pattern matching:

SELECT name,
       start_event,
       start_time,
       end_event,
       end_time,
       end_time - start_time AS duration
FROM   table_name
       MATCH_RECOGNIZE(
         PARTITION BY name
         ORDER BY timestamp
         MEASURES
           FIRST(operation) AS start_event,
           FIRST(timestamp) AS start_time,
           LAST(operation)  AS end_event,
           LAST(timestamp)  AS end_time
         PATTERN (start_op+ end_op)
         DEFINE
           start_op AS operation = 'START',
           end_op   AS operation = 'END'
       )

Which, for the sample data:

CREATE TABLE table_name (NAME, OPERATION, TIMESTAMP) AS
SELECT 'A', 'START', DATE '2024-01-18' + INTERVAL '08:24:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'A', 'START', DATE '2024-01-18' + INTERVAL '08:27:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'A', 'START', DATE '2024-01-18' + INTERVAL '08:30:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'A', 'END',   DATE '2024-01-18' + INTERVAL '08:34:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'B', 'START', DATE '2024-01-18' + INTERVAL '09:28:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'B', 'END',   DATE '2024-01-18' + INTERVAL '09:32:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'B', 'START', DATE '2024-01-18' + INTERVAL '09:37:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'B', 'END',   DATE '2024-01-18' + INTERVAL '09:39:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'C', 'START', DATE '2024-01-18' + INTERVAL '10:14:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'C', 'END',   DATE '2024-01-18' + INTERVAL '10:18:00' HOUR TO SECOND FROM DUAL;

Outputs:

NAME START_EVENT START_TIME END_EVENT END_TIME DURATION
A START 2024-01-18 08:24:00 END 2024-01-18 08:34:00 .006944444444444444444444444444444444444444
B START 2024-01-18 09:28:00 END 2024-01-18 09:32:00 .002777777777777777777777777777777777777778
B START 2024-01-18 09:37:00 END 2024-01-18 09:39:00 .001388888888888888888888888888888888888889
C START 2024-01-18 10:14:00 END 2024-01-18 10:18:00 .002777777777777777777777777777777777777778

fiddle

0
SelVazi On

You're facing an issue known as the gaps and islands problem. To address it, we'll assign a unique ID to each consecutive group of rows, indicating their start and end. This can be achieved by leveraging the difference between the row number and the cumulative total :

WITH cte AS (
  SELECT name, operation, timestamp,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY timestamp DESC) 
       - SUM(CASE WHEN operation = 'START' THEN 1 ELSE 0 END) OVER (PARTITION BY name ORDER BY timestamp DESC) as grp
  FROM mytable
)
SELECT name, 
       MIN(CASE WHEN OPERATION = 'START' THEN 'START' END) AS START_EVENT,
       MIN(CASE WHEN OPERATION = 'START' THEN TIMESTAMP END) AS START_TIME,
       MIN(CASE WHEN OPERATION = 'END' THEN 'END' END) AS END_EVENT,
       MIN(CASE WHEN OPERATION = 'END' THEN TIMESTAMP END) AS END_TIME,
       MIN(CASE WHEN OPERATION = 'END' THEN TIMESTAMP END) 
       - MIN(CASE WHEN OPERATION = 'START' THEN TIMESTAMP END) AS duration
FROM cte
GROUP BY name, grp
ORDER BY name, START_EVENT

Results :

NAME START_EVENT START_TIME END_EVENT END_TIME DURATION
A START 18/01/2024 08:24:00 END 18/01/2024 08:34:00 +000000000 00:10:00.000000
B START 18/01/2024 09:28:00 END 18/01/2024 09:32:00 +000000000 00:04:00.000000
B START 18/01/2024 09:37:00 END 18/01/2024 09:39:00 +000000000 00:02:00.000000
C START 18/01/2024 10:14:00 END 18/01/2024 10:18:00 +000000000 00:04:00.000000

Demo here