Re: HOT chain validation in verify_heapam()

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2022-11-14 21:20:49
Message-ID: CAH2-Wzmsa0yMS-JsP5_778VNG1VLAL5xO-EgxhLtBJ9KZ=gJmA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 14, 2022 at 11:28 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Part of the motivation here is also driven by trying to figure out how
> to word the complaints. We have a dedicated field in the amcheck that
> can hold one tuple offset or the other, but if we're checking the
> relationships between tuples, what do we put there? I feel it will be
> easiest to understand if we put the offset of the older tuple in that
> field and then phrase the complaint as the patch does, e.g.:

That makes a lot of sense to me, and reminds me of how things work in
verify_nbtree.c.

At a high level verify_nbtree.c works by doing a breadth-first
traversal of the tree. The search makes each distinct page the "target
page" exactly once. The target page is the clear focal point for
everything -- almost every complaint about corruption frames the
problem as a problem in the target page. We consistently describe
things in terms of their relationship with the target page, so under
this scheme everybody is...on the same page (ahem).
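
To make the "target page" convention concrete, here is a minimal sketch
of that shape of traversal. It is a toy, not code from verify_nbtree.c:
the ToyPage struct, its fields, toy_verify(), and the message wording
are all hypothetical, and the only invariant checked is that a page's
last key does not exceed the first key of its right sibling. The point
is just that every page becomes the target exactly once, and that the
complaint is anchored on the target even though it involves a sibling.

```c
#include <assert.h>
#include <stdio.h>

#define NO_BLOCK (-1)

/* Toy stand-in for a B-tree page; NOT PostgreSQL's real page layout. */
typedef struct ToyPage
{
    int right;           /* right sibling block, or NO_BLOCK */
    int leftmost_child;  /* first child block, or NO_BLOCK on the leaf level */
    int min_key;         /* first key on the page */
    int max_key;         /* last key on the page */
} ToyPage;

/*
 * Visit the tree level by level (breadth-first), left to right within a
 * level.  Each page is the "target" exactly once.  Returns the number of
 * complaints.  Assumes sibling/child links contain no cycles.
 */
static int
toy_verify(const ToyPage *pages, int npages, int root)
{
    int complaints = 0;

    for (int level_start = root;
         level_start != NO_BLOCK;
         level_start = pages[level_start].leftmost_child)
    {
        for (int target = level_start;
             target != NO_BLOCK;
             target = pages[target].right)
        {
            int sibling;

            assert(target >= 0 && target < npages);
            sibling = pages[target].right;

            /* The sibling is involved, but the report is about the target. */
            if (sibling != NO_BLOCK &&
                pages[target].max_key > pages[sibling].min_key)
            {
                printf("target block %d: last key %d exceeds first key %d "
                       "of right sibling block %d\n",
                       target, pages[target].max_key,
                       pages[sibling].min_key, sibling);
                complaints++;
            }
        }
    }

    return complaints;
}
```

Calling toy_verify(pages, npages, 0) on an array whose block 0 is the
root walks the root level first and then each lower level in turn.
Because every message leads with the target block, a reader can map any
complaint back to one step of the traversal.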

Being very deliberate about that probably had some small downsides.
Maybe it would have made a little more sense to word certain
corruption report messages so that they blamed "ancillary" pages
(sibling/child pages) directly, rather than framing the problem as
one in the target page. This still seems like the right trade-off --
the control flow can be broken up into understandable parts once you
understand that the target page is the thing that we use to describe
every other page.

> > I'm doubtful it's a good idea to try to validate the 9.4 case. The likelihood
> > of getting that right seems low and I don't see us gaining much by even trying.
>
> I agree with Peter. We have to try to get that case right. If we can
> eventually eliminate it as a valid case by some mechanism, hooray. But
> in the meantime we have to deal with it as best we can.

Practiced intellectual humility seems like the way to go here. On some
level I suspect that we'll have problems in exactly the places that we
don't look for them.

--
Peter Geoghegan
