Re: HOT chain validation in verify_heapam()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2023-03-23 16:49:10
Message-ID: CA+TgmoZQGw0A2eS7-uxUoTwSFkDTYpPRFjMm6_Q351kWSic3Xw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 22, 2023 at 5:42 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> However, this "second pass over page" loop has roughly the same
> problem as the nearby HeapTupleHeaderIsHotUpdated() coding pattern: it
> doesn't account for the fact that a tuple whose xmin was
> XID_IN_PROGRESS a little earlier on may not be in that state once we
> reach the second pass loop. Concurrent transaction abort needs to be
> accounted for. The loop needs to recheck xmin status, at least in the
> initially-XID_IN_PROGRESS-xmin case.

I don't understand why it would need to do that. If the transaction
has subsequently committed, it doesn't change anything: we'll get the
same report we would have gotten anyway. If the transaction has
subsequently aborted, we'll get a report about corruption that would
not have been reported if the abort had occurred slightly earlier.
However, the abort doesn't remove the corruption, just our ability to
detect it.

Consider a page where TID 1 is a redirect to TID 4; TID 2 is dead; and
TIDs 3 and 4 are heap-only tuples. Any other line pointers on the page
are unused. The only way this can validly happen is if there was a
tuple at TID 2 and it got updated to produce the tuple at TID 3 and
then that transaction aborted. Then the tuple at TID 2 got updated again,
producing the tuple at TID 4, and that transaction committed. But this
implies that the xmin of TID 3 must be aborted. If we observe that
it's in-progress, we know that the transaction that created TID 3 was
still running after TID 4 had already shown up, which should be
impossible, and so it's fair to report corruption. If the xmin of TID
3 then goes on to abort, a future attempt to verify this page won't be
able to notice the corruption any more, because it won't be able to
prove that TID 3's xmin aborted after TID 4's xmin committed. But a
current attempt to verify this page that has seen TID 3's xmin as
in-progress at any point after locking the page knows for sure that
TID 4 showed up before TID 3's inserter aborted, and that's
inconsistent with any legal order of operations.
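
To make the example page concrete, here is a minimal sketch of the
check being argued for. It is a toy model, not the verify_heapam()
code: every type and function name below (ToyItem,
toy_count_in_progress_orphans, and so on) is hypothetical, and it
assumes a page that holds only redirects, dead line pointers, and
heap-only tuples, so a heap-only tuple that no redirect points to has
no predecessor at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's definitions. */
typedef enum
{
    TOY_LP_UNUSED,
    TOY_LP_REDIRECT,
    TOY_LP_DEAD,
    TOY_LP_HEAP_ONLY
} ToyItemKind;

typedef enum
{
    TOY_XMIN_COMMITTED,
    TOY_XMIN_IN_PROGRESS,
    TOY_XMIN_ABORTED
} ToyXminStatus;

typedef struct
{
    ToyItemKind   kind;
    int           redirect_to;  /* target offset; meaningful for redirects only */
    ToyXminStatus xmin_status;  /* as observed after locking the page;
                                 * meaningful for heap-only tuples only */
} ToyItem;

/* True if some redirect line pointer on the page points at offset 'off'. */
static bool
toy_is_redirect_target(const ToyItem *items, int nitems, int off)
{
    for (int i = 1; i <= nitems; i++)
        if (items[i].kind == TOY_LP_REDIRECT && items[i].redirect_to == off)
            return true;
    return false;
}

/*
 * Count heap-only tuples that nothing points to and whose xmin was seen
 * as in-progress. On a page like the one described above, such a tuple
 * can only be the leftover of an aborted update, so an in-progress xmin
 * is inconsistent with any legal order of operations.
 * Offsets are 1-based; items[0] is ignored.
 */
static int
toy_count_in_progress_orphans(const ToyItem *items, int nitems)
{
    int         suspects = 0;

    assert(nitems >= 0);
    for (int off = 1; off <= nitems; off++)
    {
        if (items[off].kind != TOY_LP_HEAP_ONLY)
            continue;
        if (toy_is_redirect_target(items, nitems, off))
            continue;
        if (items[off].xmin_status == TOY_XMIN_IN_PROGRESS)
            suspects++;
    }
    return suspects;
}

/* Fill items[0..4] with the example page; TID 3's xmin status is a parameter. */
static void
toy_example_page(ToyItem items[5], ToyXminStatus tid3_status)
{
    items[0] = (ToyItem) {TOY_LP_UNUSED, 0, TOY_XMIN_ABORTED};
    items[1] = (ToyItem) {TOY_LP_REDIRECT, 4, TOY_XMIN_ABORTED};
    items[2] = (ToyItem) {TOY_LP_DEAD, 0, TOY_XMIN_ABORTED};
    items[3] = (ToyItem) {TOY_LP_HEAP_ONLY, 0, tid3_status};
    items[4] = (ToyItem) {TOY_LP_HEAP_ONLY, 0, TOY_XMIN_COMMITTED};
}
```

With TID 3's xmin passed as TOY_XMIN_IN_PROGRESS the function counts
one suspect tuple; with TOY_XMIN_ABORTED it counts none. That is the
asymmetry described above: the abort removes the evidence, not the
corruption.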

--
Robert Haas
EDB: http://www.enterprisedb.com
