From: | Evgeny Voropaev <evgeny(dot)voropaev(at)tantorlabs(dot)com> |
---|---|
To: | y(dot)sokolov(at)postgrespro(dot)ru, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add 64-bit XIDs into PostgreSQL 15 |
Date: | 2025-07-01 11:08:12 |
Message-ID: | 5cba1cf8-28e2-4309-b6c6-863400f9bed9@tantorlabs.com |
Lists: | pgsql-hackers |
Yuriy, thank you for the explanation!
> But they are not allowed to move tuples because concurrent
> backends allowed to read tuples from the page in exactly same moment.
Now I understand that we cannot repair fragmentation if we don’t have a
buffer cleanup lock on the page.
Nevertheless, using the heap_page_prune_and_freeze function without
repairing fragmentation still leads to an inconsistency problem; see below.
> No! Because patch uses flag in WAL record to instruct "redo"-side to omit
> fragmentation as well if needed.
In short, setting the XLHP_REPAIR_FRAGMENTATION flag to `false` is a
direct road to page inconsistency.
In more detail, if fragmentation repair is omitted, we can end up with an
inconsistent page state on the "redo" side. Please see attachment #1,
which illustrates the explanation below. The inconsistency can occur
because of a difference between heap_page_prune_and_freeze (the "do"
side) and heap_xlog_prune_freeze (the "redo" side): the former calls
heap_prune_satisfies_vacuum before pruning, and the latter does not.
This can result in the following situation:
DO-side:
  heap_page_prune_and_freeze =>
    1) heap_prune_satisfies_vacuum => HEAP_XMAX_COMMITTED=1 for some tuples
    2) heap_page_prune_execute => prune some tuples
    3) Do not repair fragmentation
  Result: garbage on the page consisting of pruned tuples having
  HEAP_XMAX_COMMITTED=1
REDO-side:
  heap_xlog_prune_freeze =>
    2) heap_page_prune_execute => prune tuple/tuples
    3) Do not repair fragmentation
  Result: garbage with pruned tuples having HEAP_XMAX_COMMITTED=0
That difference in HEAP_XMAX_COMMITTED is an inconsistency, even though
it lives only in garbage on the page. It is probably not the only
example of the problem. I discovered this situation in xid64v58, where I
had invoked `heap_page_prune_and_freeze(repairFragmentation=FALSE)`
everywhere I used it.
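To make the scenario concrete, here is a toy model in C. It is only a sketch of the reasoning above, not PostgreSQL code: ToyPage, TOY_XMAX_COMMITTED and all toy_* functions are hypothetical names invented for illustration. In particular, toy_repair_fragmentation simply wipes bytes that no line pointer references, which is an assumption standing in for the real compaction step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_TUPLE_LEN       8
#define TOY_XMAX_COMMITTED  0x01    /* hypothetical stand-in for HEAP_XMAX_COMMITTED */

/* A "page" holding one tuple and one line pointer. */
typedef struct ToyPage
{
    bool    item_used;               /* does the line pointer reference the tuple? */
    uint8_t tuple[TOY_TUPLE_LEN];    /* tuple bytes; byte 0 plays the role of infomask */
} ToyPage;

void toy_init(ToyPage *page)
{
    page->item_used = true;
    memset(page->tuple, 0xAB, TOY_TUPLE_LEN);
    page->tuple[0] = 0;              /* no hint bits set initially */
}

/* Models the side effect of the visibility check: a hint bit is set in place. */
void toy_set_hint(ToyPage *page)
{
    assert(page->item_used);
    page->tuple[0] |= TOY_XMAX_COMMITTED;
}

/* Models pruning: the line pointer is released, the tuple bytes stay behind. */
void toy_prune_execute(ToyPage *page)
{
    page->item_used = false;
}

/* Toy assumption: "repair" wipes bytes no longer referenced by a line pointer. */
void toy_repair_fragmentation(ToyPage *page)
{
    if (!page->item_used)
        memset(page->tuple, 0, TOY_TUPLE_LEN);
}

/* "do" side: visibility check (sets the hint), prune, optional repair. */
void toy_do_side(ToyPage *page, bool repair)
{
    toy_set_hint(page);
    toy_prune_execute(page);
    if (repair)
        toy_repair_fragmentation(page);
}

/* "redo" side: no visibility check, only prune and optional repair. */
void toy_redo_side(ToyPage *page, bool repair)
{
    toy_prune_execute(page);
    if (repair)
        toy_repair_fragmentation(page);
}

/* Byte-wise comparison of the two page images. */
bool toy_pages_identical(const ToyPage *a, const ToyPage *b)
{
    return a->item_used == b->item_used &&
           memcmp(a->tuple, b->tuple, TOY_TUPLE_LEN) == 0;
}
```

Running toy_do_side and toy_redo_side with repair=false on two freshly initialized pages leaves the line pointers equal but the leftover tuple bytes different (the hint bit survives only on the "do" side). With repair=true both images match in this model.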
As for xid64v61-v63, they invoke
`heap_page_prune_and_freeze(repairFragmentation=TRUE)` from all places
in the code except one: the patch still invokes
`heap_page_prune_and_freeze(repairFragmentation=FALSE)` from
`freeze_single_heap_page`, so we can potentially bring a page into an
inconsistent state there. If that cannot happen, please tell us why.
Unfortunately, I have not written a test that reveals this problem.
Attachment | Content-Type | Size |
---|---|---|
inconsistency_while_repairFragmentation_is_false.png | image/png | 517.6 KB |