Extract values from a list inside a nested json dictionary in PostgreSQL

43 Views Asked by At

I have a table in PostgreSQL which looks like:

+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | result                                                                                                                                                                                                              |
+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1  | {"condition": "OR", "duplicate_ids": {"book_name": {"value": "-", "duplicates": ["afec4e99", "4c86040b"]}, "author": {"value": "-", "duplicates": ["4c86040b", "362a883d", "7856b483", "7b091646", "5c99eea3", "0fa4f47f"]}}} |
| 1  | {"condition": "OR", "duplicate_ids": {"author": {"value": "some_name", "duplicates": ["69e139df"]}}}                                                                                                                |
| 2  | {"condition": "OR", "duplicate_ids": {"author": {"value": "other_name", "duplicates": ["7fa13aa8"]}}}                                                                                                               |
| 3  | {"condition": "OR", "duplicate_ids": {"publisher": {"value": "publisher_name", "duplicates": ["2b69af3d"]}}}                                                                                                        |
+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I am attempting to extract all unique values that are part of the duplicates list for id = 1. The challenge is that the key inside duplicate_ids will change frequently, meaning it has to be dynamic. In this example, I don't know beforehand that the keys inside duplicate_ids are book_name and author for id = 1.

So, the output should be:

+------------+
| duplicates |
+------------+
| afec4e99   |
| 4c86040b   |
| 362a883d   |
| 7856b483   |
| 7b091646   |
| 5c99eea3   |
| 0fa4f47f   |
| 69e139df   |
+------------+

Is there any way of achieving this using PostgreSQL query?

1

There are 1 best solutions below

1
SelVazi On BEST ANSWER

You can use jsonb_each() to extract all elements as key/value pairs. Then you can get the duplicates element :

This is to get duplicates element as JSONB array :

SELECT
    e.value -> 'duplicates'
FROM mytable t
CROSS JOIN jsonb_each(result->'duplicate_ids') e
WHERE t.id = 1

Results :

["4c86040b", "362a883d", "7856b483", "7b091646", "5c99eea3", "0fa4f47f"]
["afec4e99", "4c86040b"]
["69e139df"]

This is to convert the JSONB array into individual rows by using the jsonb_array_elements_text() function :

SELECT jsonb_array_elements_text(e.value -> 'duplicates') AS duplicates 
FROM mytable t
CROSS JOIN jsonb_each(result->'duplicate_ids') e 
WHERE t.id = 1;

Results :

duplicates
4c86040b
362a883d
7856b483
7b091646
5c99eea3
0fa4f47f
afec4e99
4c86040b
69e139df

Demo here