A table:
CREATE TABLE events_holder(
id serial primary key,
version int not null,
data jsonb not null
);
Data field can be very very large (up to 100 Mb) and looks like this:
{
"id": 5,
"name": "name5",
"events": [
{
"id": 255,
"name": "festival",
"start_date": "2022-04-15",
"end_date": "2023-04-15",
"values": [
{
"id": 654,
"type": "text",
"name": "importance",
"value": "high"
},
{
"id": 655,
"type": "boolean",
"name": "epic",
"value": "true"
}
]
},
{
"id": 256,
"name": "discovery",
"start_date": "2022-02-20",
"end_date": "2022-02-22",
"values": [
{
"id": 711,
"type": "text",
"name": "importance",
"value": "low"
},
{
"id": 712,
"type": "boolean",
"name": "specificAttribute",
"value": "false"
}
]
}
]
}
I want to select data field by version, but filtered with extra condition: where events end_date > '2022-03-15'. And the output must look like this:
{
"id": 5,
"name": "name5",
"events": [
{
"id": 255,
"name": "festival",
"start_date": "2022-04-15",
"end_date": "2023-04-15",
"values": [
{
"id": 654,
"type": "text",
"name": "importance",
"value": "high"
},
{
"id": 655,
"type": "boolean",
"name": "epic",
"value": "true"
}
]
}
]
}
How can I do this with maximum performance? How should I index the data field?
My primary solution:
with cte as (
select eh.id, eh.version, jsonb_agg(events) as filteredEvents from events_holder eh
cross join jsonb_array_elements(eh.data #> '{events}') as events
where version = 1 and (events ->> 'end_date')::timestamp >= '2022-03-15'::timestamp
group by id, version
)
select jsonb_set(data, '{events}', cte.filteredEvents) from events_holder, cte
where events_holder.id = cte.id;
But i don't think it's a good variant.
You can do this using a JSON path expression:
Given your example JSON, this returns:
Depending on your data distribution a GIN index on
dataor an index onversioncould help.If you need to re-construct the whole JSON content but with just a filtered
eventsarray, you can do something like this:(data - 'events')removes theeventskey from the json. Then the the result of the JSON path query is appended back to that (partial) object.