I have a running Postgres instance (version 16.1) in AWS with one table that has the following schema:
| Column | Type | Modifiers |
|---|---|---|
| user_id | integer | not null default nextval('profile_user_id_seq'::regclass) |
| last_seen | timestamp with time zone | not null |
| data | jsonb[] | not null |
There is a primary key on the user_id column (b-tree, to ensure uniqueness), plus a hash index on the same user_id column and an additional b-tree index on last_seen.
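For reference, this is roughly the DDL the table corresponds to. The constraint name pk_profile comes from the update query further below; the index names and the explicit sequence are just assumptions on my part:

```sql
CREATE SEQUENCE IF NOT EXISTS profile_user_id_seq;

CREATE TABLE profile (
    user_id   integer NOT NULL DEFAULT nextval('profile_user_id_seq'::regclass),
    last_seen timestamp with time zone NOT NULL,
    data      jsonb[] NOT NULL,
    CONSTRAINT pk_profile PRIMARY KEY (user_id)
);

-- Index names are made up; only the index types are stated above
CREATE INDEX profile_user_id_hash_idx ON profile USING hash (user_id);
CREATE INDEX profile_last_seen_idx    ON profile USING btree (last_seen);
```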
The data column is an array of JSON documents with the following structure:
```
{
  key1: str       # ~64 chars
  key2: str       # ~64 chars
  key3: int[128]
  key4: str       # datetime in ISO format
}
```
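For illustration, a hypothetical row with a single such element could look like this (user_id and all values are made up, and key3 is abbreviated; in reality it holds 128 integers):

```sql
-- Hypothetical example row; values are placeholders only
INSERT INTO profile (user_id, last_seen, data)
VALUES (
    42,
    now(),
    ARRAY['{"key1": "a...~64 chars...",
            "key2": "b...~64 chars...",
            "key3": [1, 2, 3],
            "key4": "2024-01-01T12:00:00+00:00"}'::jsonb]
);
```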
On top of that, there is an ongoing perf test (running for several days) that:
- 150 times per second, selects the entire row for a randomly chosen user
- 75 times per second, updates a single random row
- During that update:
  - `last_seen` is set to the current datetime
  - a random item is appended to the `data` array
  - it is ensured that the length of the `data` array does not exceed 50 items

The update is issued as the following query:
```sql
INSERT INTO profile (user_id, last_seen, data)
VALUES
    (%user_id, %last_seen, %data),
    (%user_id, %last_seen, %data)
ON CONFLICT ON CONSTRAINT pk_profile DO UPDATE
SET
    last_seen = %last_seen,
    data = trim_array(
        array_cat(excluded.data, profile.data),
        greatest(
            array_length(array_cat(excluded.data, profile.data), 1) - 50,
            0
        )
    )
```
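As a side note, the array_cat/trim_array/greatest combination is what caps the array at 50 items: the new element ends up at the front (excluded.data first) and trim_array drops the required number of trailing, i.e. oldest, elements. A minimal self-contained check of that behaviour (using generate_series purely as test data) would be:

```sql
-- Start from a 50-element array, prepend one element, confirm the cap holds
SELECT array_length(
           trim_array(
               array_cat(ARRAY[0], a.arr),
               greatest(array_length(array_cat(ARRAY[0], a.arr), 1) - 50, 0)
           ),
           1
       ) AS capped_length   -- returns 50
FROM (SELECT array_agg(i) AS arr FROM generate_series(1, 50) AS i) AS a;
```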
Automatic vacuum on the table is turned off; instead, I have a pg_cron job that runs VACUUM profile each day at 22:00.
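The job itself is essentially a one-liner scheduled through pg_cron, something like:

```sql
-- Sketch of the scheduled job; the exact registration call may differ
SELECT cron.schedule('0 22 * * *', 'VACUUM profile');
```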
But when I check the AWS console and the FreeStorageSpace metric, I see that every single vacuum run consumes 2-5 GB of disk space. The only way to recover that space is, of course, running VACUUM FULL.
During the day, when vacuum is not running, the free disk space is constant. That suggests vacuum is doing its job and marking dead tuples as reusable again: each update therefore does not take new disk space; instead, old rows marked as free during the vacuum run are reused.
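Between the nightly runs, the dead-tuple counters can also be watched with something along these lines:

```sql
-- Sketch: per-table statistics for the profile table
SELECT n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'profile';
```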
I have three possible explanations for why this is happening:
- Deletes and updates that happen during the vacuum run can't be handled by that vacuum run.
- Each vacuum run has some disk overhead that is not fully returned to the OS.
- Values in the `data` column are stored as TOAST, and as items are added or deleted the fragmentation keeps increasing.
But all of those explanations seem very unlikely to me.
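To at least put a number on the TOAST theory, the table's footprint could be broken down roughly like this (the difference between pg_table_size and pg_relation_size is mostly the TOAST relation plus the free space and visibility maps):

```sql
-- Sketch: split total size into heap, TOAST (+ maps) and indexes
SELECT pg_size_pretty(pg_relation_size('profile'))                            AS heap,
       pg_size_pretty(pg_table_size('profile') - pg_relation_size('profile')) AS toast_and_maps,
       pg_size_pretty(pg_indexes_size('profile'))                             AS indexes,
       pg_size_pretty(pg_total_relation_size('profile'))                      AS total;
```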
What is the real cause? Can this be prevented, ensuring that if I have a constant number of rows in the table and a constant number of items in the data column, the used disk space will also stay constant?
I have tried both automatic Postgres vacuum and a scheduled vacuum using the pg_cron extension. I also ran the test for a longer period (3 weeks), on the assumption that the free disk space might eventually stabilize.
