How to determine if a JSON field exists within an array in Presto?


I have a field jsonCol that holds a list of JSON objects, for example:

[{'name': 'fieldA', 'enum': 'someValA'},
 {'name': 'fieldB', 'enum': 'someValB'},
 {'name': 'fieldC', 'enum': 'someValC'}]

Another row may look like:

[{'name': 'fieldA', 'enum': 'someValA'},
 {'name': 'fieldC', 'enum': 'someValC'}]

How do I get the rows where fieldB exists? I have a query that can look up the value of fieldB; however, it fails when fieldB doesn't exist, with the error:

Error running query: Array subscript must be less than or equal to array length: 1 > 0

My query:

SELECT
    json_extract_scalar(filter(cast(json_parse(jsonCol) AS array(json)), x -> json_extract_scalar(x, '$.name') = 'fieldB')[1], '$.enum') AS myField
FROM myTable
WHERE
    json_extract_scalar(filter(cast(json_parse(jsonCol) AS array(json)), x -> json_extract_scalar(x, '$.name') = 'fieldB')[1], '$.enum') = 'someValB'

How can I check for the value someValB while also ignoring rows where the fieldB object doesn't exist at all?


There are 2 best solutions below

SelVazi On

This is a working solution for SQLite.

json_each() returns one row for each array element or object member.

json_extract() extracts and returns one or more values from a JSON document.

with cte as (
  select key, value, json_extract(value,'$.name') as name, json_extract(value,'$.enum') as enum
  from myTable, json_each(jsonCol)
)
select enum
from cte
where name = 'fieldB';

Result :

enum
----------
someValB
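
If the goal is to return the rows themselves rather than the extracted enum values, the same json_each() approach can be wrapped in an EXISTS check. This is only a sketch for SQLite, not Presto, and it assumes jsonCol holds valid JSON text (double-quoted keys and strings, unlike the single-quoted sample in the question). The aliases t and j are illustrative.

```sql
-- Sketch for SQLite: keep only rows whose array contains an object named 'fieldB'.
-- Assumes jsonCol is valid JSON text; t and j are illustrative aliases.
SELECT t.jsonCol
FROM myTable AS t
WHERE EXISTS (
    SELECT 1
    FROM json_each(t.jsonCol) AS j
    WHERE json_extract(j.value, '$.name') = 'fieldB'
);
```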


Guru Stron On

Use element_at instead of array access with [1]:

-- sample data
with dataset (jsonCol) as (
    values ('[{"name": "fieldA", "enum": "someValA"},
 {"name": "fieldB", "enum": "someValB"},
 {"name": "fieldC", "enum": "someValC"}]'),
        ('[{"name": "fieldA", "enum": "someValA"},
 {"name": "fieldC", "enum": "someValC"}]')
)

-- query
SELECT
    json_extract_scalar(
        element_at(
            filter(cast(json_parse(jsonCol) AS array(json)), x -> json_extract_scalar(x, '$.name') = 'fieldB'),
            1)
        , '$.enum') AS myField
FROM dataset
WHERE
    json_extract_scalar(
        element_at(
            filter(cast(json_parse(jsonCol) AS array(json)), x -> json_extract_scalar(x, '$.name') = 'fieldB'),
            1)
        , '$.enum') = 'someValB';

Output:

 myField
----------
 someValB
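
The reason this works can be shown with a small sketch: when no element matches, filter() returns an empty array, and element_at returns NULL for an out-of-range index, whereas the [1] subscript raises the error from the question. The literal array below is a made-up stand-in for the parsed JSON. If you only need to know whether fieldB exists, checking cardinality of the filtered array is one possible option.

```sql
-- Hypothetical one-element array; nothing equals 'b', so filter() yields an empty array.
SELECT
    element_at(filter(ARRAY['a'], x -> x = 'b'), 1) AS via_element_at,  -- NULL instead of an error
    cardinality(filter(ARRAY['a'], x -> x = 'b')) > 0 AS has_match;     -- false

-- Subscripting the same empty array fails with "Array subscript must be less than or equal to array length":
-- SELECT filter(ARRAY['a'], x -> x = 'b')[1];

-- Existence-only check against the table from the question:
SELECT jsonCol
FROM myTable
WHERE cardinality(
          filter(cast(json_parse(jsonCol) AS array(json)), x -> json_extract_scalar(x, '$.name') = 'fieldB')
      ) > 0;
```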

Note that you can likely simplify the query by moving the repeated expression into a WITH clause or subquery.
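
One way that simplification might look is sketched below: the extraction is computed once in a subquery and the outer query filters on its alias. The alias t is illustrative. Rows without fieldB produce a NULL myField, so the equality test drops them without raising an error.

```sql
-- Sketch: compute the extracted value once, then filter on it.
SELECT myField
FROM (
    SELECT
        json_extract_scalar(
            element_at(
                filter(cast(json_parse(jsonCol) AS array(json)), x -> json_extract_scalar(x, '$.name') = 'fieldB'),
                1),
            '$.enum') AS myField
    FROM myTable
) t
WHERE myField = 'someValB';  -- NULL (fieldB missing) never equals 'someValB'
```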

If you can switch to Trino, you can use its improved JSON path support via json_query (or json_exists). To be honest, I don't fully understand your goal, but here are some options:

-- query
SELECT
    JSON_QUERY(jsonCol, 'lax $[*]?(@.name == "fieldB" && @.enum == "someValB")[0].enum') myField,
    JSON_QUERY(jsonCol, 'lax $[*]?(@.name == "fieldB").enum' WITH ARRAY WRAPPER) myFieldArr
FROM dataset;

Output:

  myField   |  myFieldArr
------------+--------------
 "someValB" | ["someValB"]
 NULL       | NULL
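
For the pure existence check, a json_exists variant could look roughly like the following. This is a sketch that assumes a Trino version with the SQL/JSON path functions and a jsonCol containing valid JSON text. It reuses the filter expression from the queries above and the table name from the question.

```sql
-- Sketch for Trino: keep rows whose array has at least one object with name = "fieldB".
SELECT jsonCol
FROM myTable
WHERE json_exists(jsonCol, 'lax $[*]?(@.name == "fieldB")');
```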