Connect by prior - is this an Oracle bug?

67 Views Asked by At

So either this is an Oracle bug, or I'm being a bit slow today.

This SQL executes fine:

WITH car_paint_options AS (
   SELECT 'Escort'  car_model, 'red,blue' paint_opts FROM dual UNION
   SELECT 'Puma'    car_model, 'black'    paint_opts FROM dual
)
SELECT row_number() over(order by level) rn, level, ep.car_model,
       regexp_substr(ep.paint_opts, '[^,]+', 1, level) paint_opt
FROM   car_paint_options ep
CONNECT BY regexp_substr (ep.paint_opts, '[^,]+', 1, level) is not null

However, it gives me the wrong answer (4 lines of data):

 rn level car_model paint_opt
--- ----- --------- ---------
  1     1 Puma      black
  2     1 Focus     red
  3     2 Focus     blue
  4     2 Focus     blue

The desired output is only 3 lines of data as shown here:

 rn level car_model paint_opt
--- ----- --------- ---------
  1     1 Puma      black
  2     1 Focus     red
  3     2 Focus     blue

I understand why the problem occurs. Level 2 records are attempting to connect back to level 1 records, and what's happening is that the Focus:blue option is matching back successfully to both Puma:black and Focus:red parent rows.

So now I'm thinking: "fine, simple enough fix, let's constrain the level 2 objects so that they only ever link back to parent objects of the same car_model":

WITH car_paint_options AS (
   SELECT 'Escort'  car_model, 'red,blue' paint_opts FROM dual UNION
   SELECT 'Puma'    car_model, 'black'    paint_opts FROM dual
)
SELECT row_number() over(order by level) rn, level, ep.car_model,
       regexp_substr(ep.paint_opts, '[^,]+', 1, level) paint_opt
FROM   car_paint_options ep
CONNECT BY regexp_substr (ep.paint_opts, '[^,]+', 1, level) is not null
       AND ep.car_model = prior ep.car_model

But it results in an error:

ORA-01436: CONNECT BY loop in user data

Tested in both Oracle 12c + 19c. Can someone confirm I'm not doing anything silly here? Am I right in thinking it's a bug?

2

There are 2 best solutions below

1
MT0 On BEST ANSWER

Option 1

The normal "hack" for this is to add something to the CONNECT BY clause that gives each row a unique value within a filter that always evaluates to true such as SYS_GUID() IS NOT NULL or DBMS_RANDOM.VALUE() IS NOT NULL and prevents the hierarchical query from detecting cycles:

WITH car_paint_options (car_model, paint_opts) AS (
   SELECT 'Escort', 'red,blue' FROM dual UNION ALL
   SELECT 'Puma',   'black'    FROM dual
)
SELECT row_number() over(order by level) rn, level, ep.car_model,
       regexp_substr(ep.paint_opts, '[^,]+', 1, level) paint_opt
FROM   car_paint_options ep
CONNECT BY regexp_substr (ep.paint_opts, '[^,]+', 1, level) is not null
AND        ep.car_model = prior ep.car_model
AND        PRIOR SYS_GUID() IS NOT NULL

Which outputs:

RN LEVEL CAR_MODEL PAINT_OPT
1 1 Escort red
2 1 Puma black
3 2 Escort blue

Option 2

Regular expression functions are slow and it is often faster to use simple string functions, even if it means you need to type more (and if simple string functions are faster than regular expressions then they are even faster than regular expressions plus generating a GUID for each row):

WITH car_paint_options (car_model, paint_opts) AS (
   SELECT 'Escort', 'red,blue' FROM dual UNION ALL
   SELECT 'Puma',   'black'    FROM dual
),
bounds (car_model, paint_opts, lvl, spos, epos) AS (
  SELECT car_model,
         paint_opts,
         1,
         1,
         INSTR(paint_opts, ',', 1)
  FROM   car_paint_options
UNION ALL
  SELECT car_model,
         paint_opts,
         lvl + 1,
         epos + 1,
         INSTR(paint_opts, ',', epos + 1)
  FROM   bounds
  WHERE  epos > 0
)
SELECT ROW_NUMBER() OVER (ORDER BY lvl) AS rn,
       lvl,
       car_model,
       CASE epos
       WHEN 0
       THEN SUBSTR(paint_opts, spos)
       ELSE SUBSTR(paint_opts, spos, epos - spos)
       END AS paint_opt
FROM   bounds

Which outputs the same.

RN LVL CAR_MODEL PAINT_OPT
1 1 Escort red
2 1 Puma black
3 2 Escort blue

Option 3

If you did want to still use regular expressions and want to avoid SYS_GUID hacks then, from Oracle 12, you can generate the level values using a LATERAL join (or CROSS APPLY):

WITH car_paint_options (car_model, paint_opts) AS (
   SELECT 'Escort', 'red,blue' FROM dual UNION ALL
   SELECT 'Puma',   'black'    FROM dual
)
SELECT row_number() over(order by depth) rn,
       depth,
       ep.car_model,
       regexp_substr(ep.paint_opts, '[^,]+', 1, depth) AS paint_opt
FROM   car_paint_options ep
       CROSS JOIN LATERAL (
         SELECT LEVEL AS depth
         FROM   DUAL
         CONNECT BY LEVEL <= REGEXP_COUNT(ep.paint_opts, '[^,]+')
       )

Which also outputs the same.


Performance

A comparison of the performance of different options for splitting delimited strings is given in this answer. The summary is that option 2 tends to be the most performant (and the most to type).

fiddle

1
Amir Kadyrov On

For earlier versions of Oracle that do not support lateral joins. Lateral join was introduced in Oracle 12c.

WITH car_paint_options (car_model, paint_opts) AS (
   SELECT 'Escort', 'red,blue' FROM dual UNION ALL
   SELECT 'Puma',   'black'    FROM dual
)
select ep.*, regexp_substr(ep.paint_opts, '[^,]+', 1, column_value) c_value
from car_paint_options ep,
table(select cast (collect(level) as sys.odcinumberlist) lvl
from dual
connect by level <= regexp_count(ep.paint_opts, '[^,]+')
) t;

CAR_MODEL PAINT_OPTS C_VALUE
--------- ---------- --------
Escort    red,blue   red
Escort    red,blue   blue
Puma      black      black