heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead bug (Was: amcheck (B-Tree integrity checking tool))

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "Wood, Dan" <hexpert(at)amazon(dot)com>, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead bug (Was: amcheck (B-Tree integrity checking tool))
Date: 2017-10-10 00:19:11
Message-ID: CAH2-Wzkb9GxDBdjJmC7FJ+tDmWvsc12R2X=4VOVhwfEmHuWfJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 16, 2016 at 6:46 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> Noah and I discussed possible future directions for amcheck in person
>> recently. I would like to get Noah's thoughts again here on how a tool
>> like amcheck might reasonably target other access methods for
>> verification. In particular, the heapam. MultiXacts were mentioned as
>> a structure that could receive verification in a future iteration of
>> this tool, but I lack expertise there.
>
> Yes, a heap checker could examine plenty of things. Incomplete list:

The heap checking enhancement that is under review in the next CF [1],
which checks an index against the table it indexes, doesn't directly
test any of the things you outlined in this mail almost exactly a year
ago. (That said, some of the same things are checked indirectly,
simply by using IndexBuildHeapScan().)

I do still think that there is a very real need for a "direct
heap/SLRU checker" function, though. I only put off implementing one
because there was still some low hanging fruit. I would like to go
over your suggestions again now. My understanding of what we *can* do
has evolved in the past several months.

> - Detect impossible conditions in the hint bits. A tuple should not have both
> HEAP_XMAX_COMMITTED and HEAP_XMAX_INVALID. Every tuple bearing
> HEAP_ONLY_TUPLE should bear HEAP_UPDATED. HEAP_HASVARWIDTH should be true
> if and only if the tuple has a non-NULL value in a negative-typlen column,
> possibly a dropped column. A tuple should not have both HEAP_KEYS_UPDATED
> and HEAP_XMAX_LOCK_ONLY.

It certainly makes sense to check for clearly contradictory hint bits like this.

> - Verify agreement between CLOG, MULTIXACT, and hint bits.

This is where it gets complicated, I think. This is what I really want
to talk about.

The recent commit 20b65522 pretty clearly established that a tuple
could technically escape freezing (having HEAP_XMIN_FROZEN set)
despite actually being before a relfrozenxid cut-off. The idea was
that as long as we reliably set hint bits on heap-only tuples through
WAL-logging, it doesn't matter that their CLOG could be truncated,
because nobody will ever need to interrogate the CLOG anyway (to coin
a phrase, the heap-only tuple nevertheless still had its xmax
"logically frozen"). If nothing else, this leaves me with a very
squishy set of invariant conditions to work with when I go to verify
agreement with CLOG, MULTIXACT, and hint bits.

So, the question is: What is the exact set of invariant conditions
that I can check when I go to verify agreement between a heap relation
and the various SLRUs? In particular, at what precise XID-wise point
do CLOG and MULTIXACT stop reliably storing transaction status? And,
is there a different point for the xmax of tuples that happen to be
heap-only tuples?

Another important concern here following 20b65522 is: Why is it safe
to assume that nobody will ever call TransactionIdDidCommit() for
"logically frozen" heap-only tuples that are not at the end of their
HOT chain, and in so doing get a wrong answer? I can't find a rule
that implies that there is no dangerous ambiguity that's written down
anywhere. I *can* find a comment that suggests that it's wrong, though
-- the "N.B." comment at the top of heap_prepare_freeze_tuple().
(Granted, that comment doesn't seem to acknowledge the fact that the
caller does WAL-log as part of freezing; there has been some churn in
this area.)

I'm pretty sure that the classic rule for TransactionIdDidCommit()
when called from tqual.c was that the tuple cannot have
HEAP_XMIN_FROZEN set, which was straightforward enough back when a
frozen tuple was assumed to have necessarily committed (and to have a
"raw" xmin point that should logically be visible to everyone). This
business of "logically frozen" xmax values for dead heap-only tuples
makes things *way* more complicated, though. I'm concerned that we've
failed to account for that, and that TransactionIdDidCommit() calls
concerning pre-relfrozenxid XIDs can still happen.

I should point out that for whatever the reason, there is evidence
that we do in fact risk TransactionIdDidCommit() calls that give wrong
answers (independent of the ~2014 MULTIXACT bugs where that was
clearly the case, and even before 20b65522): Jeff Janes showed [2]
that there is probably some unaccounted-for case where "premature"
truncation takes place. This may be related to the recent HOT
chain/freezing bugs, and our (only partial [3]) fixes may have fixed
that, too -- I just don't know.

[1] https://commitfest.postgresql.org/15/1263/
[2] https://postgr.es/m/CAMkU=1y-TNP_7JdFg_ubXqDB8esMO280aCDGDwsa429HRtE8Lw@mail.gmail.com
[3] https://postgr.es/m/20171007232524.sdpi3jk2636fvjge@alvherre.pgsql
--
Peter Geoghegan

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabrízio de Royes Mello 2017-10-10 00:45:23 Re: search path security issue?
Previous Message Jing Wang 2017-10-09 23:57:26 Re: Support to COMMENT ON DATABASE CURRENT_DATABASE