Pg 9.6. autovacuum on. Original problem was : we can't insert value because of pkey violation, but select doesn't return rows with same value.
posstorage=# \d system_parameter
Table "posstorage.system_parameter"
Column | Type | Modifiers
-------------+-------------------+-----------
name | character varying | not null
value | character varying |
description | character varying |
Indexes:
"system_parameter_pkey" PRIMARY KEY, btree (name) CLUSTER
"system_parameter$name$idx" btree (name)
posstorage=# select * from system_parameter;
name | value | description
--------------------------+-------------------+------------------------------------------------------------------------------------
BO_ACCOUNT_ID | 1 | Рдентификатор пользователя BO
STATUSHISTORY_QUERY_DAYS | 33 | Максимальный интервал просмотра исторических данных в таблице statushistory, дней.
1DEFAULT_OFD | 1ATOL_v4_stricted | 1Значение ОФД по умолчанию, если значение не задано в поле OFD таблицы ENTITY
posstorage=# insert into system_parameter values ('DEFAULT_OFD','xxx','xxx');
ERROR: duplicate key value violates unique constraint "system_parameter_pkey"
DETAIL: Key (name)=(DEFAULT_OFD) already exists.
After some research as we understand it wright - found that we have another row with this value that was commited , but is not visible and not getting Freeze after manual vaccuum , vacuum freeze or cluster.
posstorage=# select convert_to('DEFAULT_OFD', 'UTF8'); -- \x44454641554c545f4f4644
posstorage=# SELECT format('(0,%s)', lp) AS tid,
t_ctid AS replacement,
t_xmax <> 0 AS dead,
t_infomask2 & 16384 <> 0 AS hot_updated,
t_attrs[1] AS total
FROM heap_page_item_attrs(get_raw_page('system_parameter', 0), 'system_parameter') where t_attrs[1]='\x1944454641554c545f4f4644';
tid | replacement | dead | hot_updated | total
--------+-------------+------+-------------+----------------------------
(0,15) | (0,15) | t | f | \x1944454641554c545f4f4644
(0,16) | (0,16) | t | f | \x1944454641554c545f4f4644
(0,17) | (0,17) | t | f | \x1944454641554c545f4f4644
(0,18) | (0,18) | t | f | \x1944454641554c545f4f4644
(0,19) | (0,19) | t | f | \x1944454641554c545f4f4644
(0,21) | (0,21) | t | f | \x1944454641554c545f4f4644
(0,22) | (0,22) | f | f | \x1944454641554c545f4f4644 <-- this row is not dead but not visible now and seems to issue constraint violation
(7 rows)
Lets look deeper:
posstorage=# SELECT * FROM heap_page_items(get_raw_page('system_parameter',0)) where t_ctid='(0,22)' ;
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
lp | 22
lp_off | 4152
lp_flags | 1
lp_len | 190
t_xmin | 226114219
t_xmax | 0
t_field3 | 59
t_ctid | (0,22)
t_infomask2 | 3
t_infomask | 2306
t_hoff | 24
t_bits |
t_oid |
t_data | \x1944454641554c545f4f46442341544f4c5f76345f737472696374656400000018020000d097d0bdd0b0d187d0b5d0bdd0b8d0b520d09ed0a4d09420d0bfd0be20d183d0bcd0bed0bbd187d0b0d0bdd0b8d18e2c20d0b5d181d0bbd0b820d0b7d0bdd0b0d187d0b5d0bdd0b8d0b520d0bdd0b520d0b7d0b0d0b4d0b0d0bdd0be20d0b220d0bfd0bed0bbd0b5204f464420d182d0b0d0b1d0bbd0b8d186d18b20454e54495459
As we can see row was inserted at xmin = 226114219 and was not removed (t_xmax=0) and current txid is lower (94831127)
posstorage=# SELECT txid_current();
-[ RECORD 1 ]+---------
txid_current |
posstorage=# SELECT '(0,'||lp||')' AS ctid,
CASE lp_flags
WHEN 0 THEN 'unused'
WHEN 1 THEN 'normal'
WHEN 2 THEN 'redirect to '||lp_off
WHEN 3 THEN 'dead'
END AS state,
t_xmin as xmin,
t_xmax as xmax,
(t_infomask & 256) > 0 AS xmin_commited,
(t_infomask & 512) > 0 AS xmin_aborted,
(t_infomask & 1024) > 0 AS xmax_commited,
(t_infomask & 2048) > 0 AS xmax_aborted,
t_ctid
FROM heap_page_items(get_raw_page('system_parameter',0)) where t_ctid='(0,22)';
-[ RECORD 1 ]-+----------
ctid | (0,22)
state | normal
xmin | 226114219
xmax | 0
xmin_commited | t
xmin_aborted | f
xmax_commited | f
xmax_aborted | t
t_ctid | (0,22)
posstorage=# SELECT * FROM heap_page('system_parameter',0) where t_ctid='(0,22)';
ctid | state | xmin | xmax | t_ctid
--------+--------+---------------+-------+--------
(0,22) | normal | 226114219 (c) | 0 (a) | (0,22)
(1 row)
As we can see rows are not getting frozen after manual vacuum full,vacuum freeze or cluster
posstorage=# vacuum FREEZE verbose system_parameter ;
INFO: vacuuming "posstorage.system_parameter"
INFO: index "system_parameter_pkey" now contains 34 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "system_parameter$name$idx" now contains 34 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "system_parameter": found 0 removable, 34 nonremovable row versions in 1 out of 1 pages
DETAIL: 14 dead row versions cannot be removed yet.
There were 2 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_34423"
INFO: index "pg_toast_34423_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_34423": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
posstorage=# SELECT * FROM heap_page('system_parameter',0,0);
ctid | state | xmin | xmin_age | xmax | t_ctid
--------+--------+---------------+------------+---------------+--------
(0,1) | normal | 61312329 (f) | 33518801 | 0 (a) | (0,1)
(0,2) | normal | 94840819 (c) | -9689 | 0 (a) | (0,2)
(0,3) | normal | 94840820 (c) | -9690 | 0 (a) | (0,3)
(0,4) | normal | 94840821 (c) | -9691 | 0 (a) | (0,4)
(0,5) | normal | 94840822 (c) | -9692 | 0 (a) | (0,5)
(0,6) | normal | 94840823 (c) | -9693 | 0 (a) | (0,6)
(0,7) | normal | 94840824 (c) | -9694 | 0 (a) | (0,7)
(0,8) | normal | 207785422 (c) | -112954292 | 0 (a) | (0,8)
(0,9) | normal | 61312329 (f) | 33518801 | 207785422 (c) | (0,8)
(0,10) | normal | 94840826 (c) | -9696 | 0 (a) | (0,10)
(0,11) | normal | 94840827 (c) | -9697 | 0 (a) | (0,11)
(0,12) | normal | 94840828 (c) | -9698 | 0 (a) | (0,12)
(0,13) | normal | 94840829 (c) | -9699 | 0 (a) | (0,13)
(0,14) | normal | 94840997 (c) | -9867 | 0 (a) | (0,14)
(0,15) | normal | 226114167 (c) | -131283037 | 226114168 (c) | (0,15)
(0,16) | normal | 226114169 (c) | -131283039 | 226114207 (c) | (0,16)
(0,17) | normal | 226114208 (c) | -131283078 | 226114209 (c) | (0,17)
(0,18) | normal | 226114210 (c) | -131283080 | 226114214 (c) | (0,18)
(0,19) | normal | 226114215 (c) | -131283085 | 226114217 (c) | (0,19)
(0,20) | normal | 142584526 (c) | -47753396 | 0 (a) | (0,20)
(0,21) | normal | 195887902 (c) | -101056772 | 226114166 (c) | (0,21)
(0,22) | normal | 226114219 (c) | -131283089 | 0 (a) | (0,22) <- still not frozen
(0,23) | normal | 241684907 (c) | -146853777 | 0 (a) | (0,23)
(0,24) | normal | 195888393 (c) | -101057263 | 241684907 (c) | (0,23)
(0,25) | normal | 142584889 (c) | -47753759 | 195888393 (c) | (0,24)
(0,26) | normal | 241823539 (c) | -146992409 | 241824890 (c) | (0,26)
(0,27) | normal | 241843019 (c) | -147011889 | 241922453 (c) | (0,27)
(0,28) | normal | 241924322 (c) | -147093192 | 0 (a) | (0,28)
(0,29) | normal | 242116441 (c) | -147285311 | 0 (a) | (0,29)
(0,30) | normal | 241684906 (c) | -146853776 | 242116441 (c) | (0,29)
(0,31) | normal | 195888394 (c) | -101057264 | 241684906 (c) | (0,30)
(0,32) | normal | 142584888 (c) | -47753758 | 195888394 (c) | (0,31)
(0,33) | normal | 94829330 (f) | 1800 | 0 (a) | (0,33)
(0,34) | normal | 94829545 (f) | 1585 | 0 (a) | (0,34)
We checked pg_stat_activity(has no long running tx),no replication,feedback or prepared_tx, also restarted cluster.
ps: research are from https://habr.com/en/company/postgrespro/blog/487590/ and https://habr.com/en/company/postgrespro/blog/477648/
Tried manual vacuum full,vacuum freeze or cluster, want understand why old rows are not getting frozen and are not visible
That looks like you have experienced transaction ID wraparound, a form of data corruption. In PostgreSQL v14 or better, you could use the function
heap_force_freezefrom pg_surgery to mark the tuple as frozen and make it visible.In an old, unsupported version like 9.6, the best you can do is dump and restore to a recent PostgreSQL version and abandon the “vanished” tuples.