Re: Verify true root on replicas with amcheck

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: godjan • <g0dj4n(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Verify true root on replicas with amcheck
Date: 2020-01-17 00:40:47
Message-ID: CAH2-WzkxnA39Rzad+M7DJOQ3Up=_7dTE-6jXmx+4mp+5v+KWzw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 9, 2020 at 12:55 AM godjan • <g0dj4n(at)gmail(dot)com> wrote:
> Hi, we have trouble detecting true root corruption on replicas. I made a patch that resolves this by locking the meta page and the potential root page.

What do you mean by true root corruption? What is the cause of the
problem? What symptom does it have in your application?

While I was the one who wrote the existing !readonly/parent check for
the true root (a check which your patch makes work with the regular
bt_index_check() function), I wasn't thinking of any particular
corruption scenario at the time. I wrote the check simply because it
was easy to do so (with a heavyweight ShareLock on the index).

> I heard that amcheck has an invariant of locking no more than one page at a time, to avoid deadlocks. Is a deadlock possible here?

This is a conservative principle that I came up with when I wrote the
original version of amcheck. It's not strictly necessary, but it
seemed like a good idea. It should be safe to "couple" buffer locks in
a way that matches the B-Tree code -- as long as it is thought through
very carefully. I am probably going to relax the rule for one specific
case soon -- see:

https://postgr.es/m/F7527087-6E95-4077-B964-D2CAFEF6224B@yandex-team.ru

Your patch looks like it gets it right (it won't deadlock with other
sessions that access the metapage), but I hesitate to commit it
without a strong justification. Acquiring multiple buffer locks
concurrently is worth avoiding wherever possible.
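
To illustrate the lock-ordering point, here is a minimal Python sketch. It
is not amcheck or nbtree code: `metapage_lock`, `root_lock`,
`with_coupled_locks` and `run_workers` are hypothetical names, and the
"metapage first, then root" order is an assumption made for the example.
The idea it shows is only that sessions which always take the two locks in
the same order cannot deadlock against each other.

```python
import threading

# Hypothetical stand-ins for two buffer locks; these are not PostgreSQL APIs.
metapage_lock = threading.Lock()
root_lock = threading.Lock()


def with_coupled_locks(action):
    """Hold both locks while running action.

    Every caller takes metapage_lock before root_lock (assumed order),
    so no two callers can each hold one lock while waiting for the other.
    """
    with metapage_lock:
        with root_lock:
            return action()


def run_workers(n_workers=4, iterations=500):
    """Run several threads that all couple the locks in the same order."""
    counter = {"checks": 0}

    def bump():
        # Runs with both locks held, so the increment is not racy.
        counter["checks"] += 1

    def worker():
        for _ in range(iterations):
            with_coupled_locks(bump)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=30)

    finished = all(not t.is_alive() for t in threads)
    return finished, counter["checks"]


if __name__ == "__main__":
    print(run_workers())
```

If one worker instead took `root_lock` first and then `metapage_lock`, two
threads could each hold one lock and wait forever on the other. That is the
hazard the one-page-at-a-time rule avoids without any further analysis.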

--
Peter Geoghegan
