Re: HOT chain validation in verify_heapam()

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2022-11-14 22:13:10
Message-ID: CAH2-Wz=j_5nrVXF666prXr_ntEH1n_H2U5sqz0xu4fz46yMBjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 14, 2022 at 12:58 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I wonder if we ought to add an error check to heap_prepare_freeze_tuple()
> against this scenario. We're working towards being more aggressive around
> freezing, which will make it more likely to hit corner cases around this.

In theory my work on freezing doesn't change the basic rules about how
freezing works, and doesn't know anything about HOT, so it shouldn't
introduce any new risk. Even still, I agree that this seems like
something to do in the scope of the same work, just in case. Plus it's
just important.

It would be possible to have exhaustive heap_prepare_freeze_tuple
checks in assert-only builds -- we can exhaustively check the final
array of prepared freeze plans that we collected for a given heap
page, and check it against the page exhaustively right before freezing
is executed. That's not perfect, but it would be a big improvement.

Right now I am not entirely sure what I would need to check in such a
mechanism. I am legitimately unsure of what the rules are in light of
this new information.

> Which'd still be fine if the whole chain were already "fully dead". One way I
> think this can happen is <= PG 13's HEAPTUPLE_DEAD handling in
> lazy_scan_heap().

You mean the tupgone thing? Perhaps it would have avoided this
particular problem, or one like it. But it had so many other problems
that I don't see why it matters now.

> At least as long as correctness requires not ending up in endless loops -
> indeed. We end up with lazy_scan_prune() endlessly retrying. Without a chance
> to interrupt. Shouldn't there at least be a CFI somewhere?

Probably, but that wouldn't change the fact that it's a bug when this
happens. Obviously it's more important to avoid such a bug than it is
to ameliorate it.

> I think the problem partially is that the proposed verify_heapam() code is too
> "aggressive" considering things to be part of the same hot chain - which then
> means we have to be very careful about erroring out.
>
> The attached isolationtester test triggers:
> "unfrozen tuple was updated to produce a tuple at offset %u which is frozen"
> "updated version at offset 3 is also the updated version of tuple at offset %u"
>
> Despite there afaict not being any corruption. Worth noting that this happens
> regardless of hot/non-hot updates being used (uncomment s3ci to see).

Why don't you think that there is corruption?

The terminology here is tricky. It's possible that the amcheck patch
makes a very good point here, even without necessarily complaining
about a state that leads to obviously wrong behavior. It's also
possible that there really is wrong behavior, at least in my mind -- I
don't know what your remarks about no corruption are really based on.

I feel like I'm repeating myself more than I should, but: why isn't it
as simple as "HOT chain traversal logic is broken by frozen xmin in
the obvious way, therefore all bets are off"? Maybe you're right about
the proposed new functionality getting things wrong with your
adversarial isolation test, but I seem to have missed the underlying
argument. Are you just talking about regular update chains here, not
HOT chains? Something else?

> One important protection right now is that vacuumlazy.c uses a more
> pessimistic horizon than pruneheap.c. Even if visibility determinations within
> pruning recompute the horizon, vacuumlazy.c won't freeze based on the advanced
> horizon. I don't quite know where we we'd best put a comment with a warning
> about this fact.

We already have comments discussing the relationship between
OldestXmin and vistest (as well as rel_pages) in heap_vacuum_rel().
That seems like the obvious place to put something like this, at least
to me.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-11-14 22:33:07 Re: HOT chain validation in verify_heapam()
Previous Message Andres Freund 2022-11-14 22:11:43 Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment