How to select a filtered PostgreSQL jsonb field with performance in mind?


A table:

CREATE TABLE events_holder(
    id serial primary key,
    version int not null,
    data jsonb not null
);

The data field can be very large (up to 100 MB) and looks like this:

{
  "id": 5,
  "name": "name5",
  "events": [
    {
      "id": 255,
      "name": "festival",
      "start_date": "2022-04-15",
      "end_date": "2023-04-15",
      "values": [
        {
          "id": 654,
          "type": "text",
          "name": "importance",
          "value": "high"
        },
        {
          "id": 655,
          "type": "boolean",
          "name": "epic",
          "value": "true"
        }
      ]
    },
    {
      "id": 256,
      "name": "discovery",
      "start_date": "2022-02-20",
      "end_date": "2022-02-22",
      "values": [
        {
          "id": 711,
          "type": "text",
          "name": "importance",
          "value": "low"
        },
        {
          "id": 712,
          "type": "boolean",
          "name": "specificAttribute",
          "value": "false"
        }
      ]
    }
  ]
}

I want to select the data field by version, filtered by an extra condition: keep only events whose end_date is later than '2022-03-15'. The output should look like this:

{
  "id": 5,
  "name": "name5",
  "events": [
    {
      "id": 255,
      "name": "festival",
      "start_date": "2022-04-15",
      "end_date": "2023-04-15",
      "values": [
        {
          "id": 654,
          "type": "text",
          "name": "importance",
          "value": "high"
        },
        {
          "id": 655,
          "type": "boolean",
          "name": "epic",
          "value": "true"
        }
      ]
    }
  ]
}

How can I do this with maximum performance? How should I index the data field?

My primary solution:

with cte as (
  select eh.id, eh.version, jsonb_agg(events) as filtered_events
  from events_holder eh
  cross join jsonb_array_elements(eh.data -> 'events') as events
  where eh.version = 1
    and (events ->> 'end_date')::date >= date '2022-03-15'
  group by eh.id, eh.version
)
select jsonb_set(eh.data, '{events}', cte.filtered_events)
from events_holder eh
join cte on eh.id = cte.id;

But I don't think this is a good approach.
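One drawback of the CTE above is that it scans events_holder twice and silently drops rows in which no event matches the filter. A sketch of an alternative (untested here, same schema assumed) using a lateral subquery, which scans the table once and keeps non-matching rows with an empty events array:

```sql
-- Lateral subquery builds the filtered array per row; the aggregate
-- subquery always returns exactly one row, so no rows are lost.
select jsonb_set(eh.data, '{events}', coalesce(f.filtered_events, '[]'::jsonb))
from events_holder eh
cross join lateral (
  select jsonb_agg(ev) as filtered_events
  from jsonb_array_elements(eh.data -> 'events') as ev
  where (ev ->> 'end_date')::date >= date '2022-03-15'
) f
where eh.version = 1;
```

The coalesce handles the case where no event qualifies and jsonb_agg returns NULL.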

1 Answer

Answer from AudioBubble:

You can do this using a JSON path expression:

select eh.id, eh.version, 
       jsonb_path_query_array(data, 
                              '$.events[*] ? (@.end_date.datetime() >= "2022-03-15".datetime())')
from events_holder eh
where eh.version = 1
  and eh.data @? '$.events[*] ? (@.end_date.datetime() >= "2022-03-15".datetime())'

Given your example JSON, this returns:

[
    {
        "id": 255,
        "name": "festival",
        "values": [
            {
                "id": 654,
                "name": "importance",
                "type": "text",
                "value": "high"
            },
            {
                "id": 655,
                "name": "epic",
                "type": "boolean",
                "value": "true"
            }
        ],
        "end_date": "2023-04-15",
        "start_date": "2022-04-15"
    }
]

Depending on your data distribution, a GIN index on data (which supports the @? operator) or a plain B-tree index on version could help.
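For reference, the two options could be created like this (index names are illustrative). Note that a jsonb_path_ops GIN index supports the @? and @@ operators, but it is mainly effective for equality-style jsonpath conditions; a range comparison like the one here may not be accelerated by it, so measure before committing to the larger index:

```sql
-- GIN index supporting the @? / @@ jsonpath operators on data
create index events_holder_data_gin
    on events_holder using gin (data jsonb_path_ops);

-- B-tree index for the version filter
create index events_holder_version_idx
    on events_holder (version);
```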

If you need to re-construct the whole JSON content but with just a filtered events array, you can do something like this:

select (data - 'events') ||
       jsonb_build_object('events',
         jsonb_path_query_array(data, '$.events[*] ? (@.end_date.datetime() >= "2022-03-15".datetime())'))
from events_holder eh
...

(data - 'events') removes the events key from the JSON. The result of the JSON path query is then merged back into that (partial) object under the events key.
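The two operators can be seen in isolation on a small literal:

```sql
-- '-' removes a top-level key, '||' concatenates (merges) two jsonb objects
select ('{"id": 5, "events": [1, 2]}'::jsonb - 'events')
       || jsonb_build_object('events', '[1]'::jsonb);
-- returns {"id": 5, "events": [1]}
```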