Re: amcheck's verify_heapam(), and HOT chain verification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: amcheck's verify_heapam(), and HOT chain verification
Date: 2021-11-09 18:12:01
Message-ID: CAH2-Wz=-+vBm=fokanc-zg0UpnTwvNinjtJVGTExPFw4Ud34LQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 7, 2021 at 9:30 AM Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
> Yes, I am quite interested, though I will have to alternate between this work and the various patch sets that I've already submitted for this development cycle.

Great! We still have a lot of work to do with HOT-chain-level
invariants. I have seen very clear evidence of that in the last 24
hours:

https://postgr.es/m/CAH2-Wzma=Y3O+LRx2Wj_HwGbbbeNwr6FoJzXni8hxOMw55pcZg@mail.gmail.com

> I think we need a corruption generating framework that can be deployed on the buildfarm. What I have in mind is inspired by your comments about the "freeze the dead" bug.

I'm not sure that that's truly necessary. IMV the important thing is
that we formalize the invariants, and have tooling that can test them.
Maintaining tests that actually display specific broken behavior (as
opposed to its absence) seems like it might be quite a burden.

There are many specific ways that these invariants might break, but
the specifics shouldn't matter -- the invariants should cut through
that (to the extent that that's possible). The "freeze the dead" bug
is mostly useful as a way of framing the discussion (perhaps even in
code comments). I did this with a historic CREATE INDEX CONCURRENTLY
bug, in code comments for heapallindexed verification. Verification
using heapallindexed actually detected two more CIC bugs years later
-- not including the earlier fixes that didn't quite get everything
right (thinking of the prepared xact CIC bugs found using amcheck).
The invariants tested by heapallindexed generalize to many different
situations, including situations that I couldn't possibly have
anticipated with any kind of precision. Having the right high level
idea is what really mattered.
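
To make the "invariant, not specific bug" idea concrete, here is a
minimal sketch of a heapallindexed-style check. It is a toy model under
stated assumptions, not amcheck's code: the HeapTuple type, its fields,
and missing_from_index() are hypothetical names, and a plain Python set
stands in for the Bloom filter that amcheck uses to fingerprint index
tuples.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class HeapTuple(NamedTuple):
    """Hypothetical, simplified stand-in for a heap tuple."""
    tid: tuple[int, int]  # (block number, offset number)
    key: int              # value of the indexed column
    visible: bool         # True if the tuple is one that must be indexed


def missing_from_index(
    heap: Iterable[HeapTuple],
    index_entries: Iterable[tuple[int, tuple[int, int]]],
) -> list[HeapTuple]:
    """Return heap tuples violating 'every visible heap tuple is indexed'.

    The check says nothing about *how* an entry went missing; any bug
    that loses an index entry trips the same invariant.
    """
    # Assumption: an exact set replaces the probabilistic Bloom filter.
    fingerprints = set(index_entries)
    return [
        t for t in heap
        if t.visible and (t.key, t.tid) not in fingerprints
    ]


heap = [
    HeapTuple((0, 1), 10, True),
    HeapTuple((0, 2), 20, True),   # visible, but absent from the index
    HeapTuple((0, 3), 30, False),  # not visible, so not required
]
index_entries = [(10, (0, 1))]

violations = missing_from_index(heap, index_entries)
```

Here violations contains only the tuple at (0, 2). The check has no
knowledge of which bug lost the index entry, which is the property
described above.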

--
Peter Geoghegan
