Re: Amcheck: do rightlink verification with lock coupling

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Amcheck: do rightlink verification with lock coupling
Date: 2020-01-17 01:11:03
Message-ID: CAH2-Wz=Mz6-588AKMw5p0CpRq5a5SWMsC0QM+BOnLCjjvpqV9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 13, 2020 at 8:47 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> > 11 янв. 2020 г., в 7:49, Peter Geoghegan <pg(at)bowt(dot)ie> написал(а):
> > I'm curious why Andrey's corruption problems were not detected by the
> > cross-page amcheck test, though. We compare the first non-pivot tuple
> > on the right sibling leaf page with the last one on the target page,
> > towards the end of bt_target_page_check() -- isn't that almost as good
> > as what you have here in practice? I probably would have added
> > something like this myself earlier, if I had reason to think that
> > verification would be a lot more effective that way.
>
> We were dealing with corruption caused by lost page update. Consider two pages:
> A->B
> If A is split into A` and C we get:
> A`->C->B
> But if update of A is lost we still have
> A->B, but B backward pointers points to C.
> B's smallest key is bigger than hikey of A`, but this do not violate
> cross-pages invariant.
>
> Page updates may be lost due to bug in backup software with incremental
> backups, bug in storage layer of Aurora-style system, bug in page cache, incorrect
> fsync error handling, bug in ssd firmware etc. And our data checksums do not
> detect this kind of corruption. BTW I think that it would be better if our
> checksums were not stored on a page itseft, they could detect this kind of faults.

I find this argument convincing. I'll try to get this committed soon.

While you could have used bt_index_parent_check() or heapallindexed to
detect the issue, those two options are a lot more expensive (plus the
former option won't work on a standby). Relaxing the principle that
says that we shouldn't hold multiple buffer locks concurrently doesn't
seem like that much to ask for to get such a clear benefit.

I think that this is safe, but page deletion/half-dead pages need more
thought. In general, the target page could have become "ignorable"
when rechecked.

> We were able to timely detect all those problems on primaries in our testing
> environment. But much later found out that some standbys were corrupted,
> the problem appeared only when they were promoted.
> Also, in nearby thread Grygory Rylov (g0djan) is trying to enable one more
> invariant in standby checks.

I looked at that thread just now, but Grygory didn't say why this true
root check was particularly important, so I can't see much upside.
Plus that seems riskier than what you have in mind here.

Does it have something to do with the true root looking like a deleted
page? The details matter.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-01-17 01:21:29 Re: pgindent && weirdness
Previous Message Michael Paquier 2020-01-17 01:09:54 Re: Setting min/max TLS protocol in clientside libpq