Re: Concurrency bug in amcheck

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Concurrency bug in amcheck
Date: 2020-04-27 08:51:51
Message-ID: CAPpHfdtM+=2NRm6M+f61uNh28J9GUrPe9bubHE5k5C_kAUJ9qw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 22, 2020 at 7:47 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Tue, Apr 21, 2020 at 2:54 AM Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > Proposed fix is attached. Spotted by Konstantin Knizhnik,
> > reproduction case and fix from me.
>
> I wonder if we should fix btree_xlog_unlink_page() instead of amcheck.
> We already know that its failure to be totally consistent with the
> primary causes problems for backwards scans -- this problem can be
> fixed at the same time:
>
> https://postgr.es/m/CANtu0ohkR-evAWbpzJu54V8eCOtqjJyYp3PQ_SGoBTRGXWhWRw@mail.gmail.com
>
> We'd probably still use your patch for the backbranches if we went this way.
>
> What do you think?

I've skimmed through the thread. It seems to be quite independent of
this issue. This issue is related to the fact that we leave some
items on deleted pages on the primary, while at the same time we have
no items on deleted pages on the standby. This inconsistency causes
amcheck to pass normally on the primary, but fail on the standby. The
backward scan issue on the standby seems to be related solely to the
locking protocol and the btpo_prev, btpo_next pointers. It wasn't
mentioned in that thread that we might need hikeys on deleted pages.

Assuming we don't actually need any items on deleted pages, I propose
deleting them on the primary as well. That would make the contents of
the primary and standby more consistent. What do you think?
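
To illustrate the inconsistency described above, here is a minimal toy
sketch. It is not the attached patch and does not use PostgreSQL's real
page layout or amcheck code: ToyPage, its fields, and both check
functions are hypothetical names invented for this example, and "has a
high key" is modeled simply as "holds at least one item".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a B-tree page; not PostgreSQL's page layout. */
typedef struct ToyPage
{
    bool deleted;   /* page has been deleted from the tree */
    int  nitems;    /* number of items still stored on the page */
} ToyPage;

/* Primary, as described above: the deleted page keeps leftover items. */
static const ToyPage toy_primary_deleted = { true, 1 };

/* Standby, as described above: the deleted page holds no items. */
static const ToyPage toy_standby_deleted = { true, 0 };

/*
 * Naive check: expects every page to hold at least one item (standing in
 * for a high key), without looking at the deleted flag.
 */
static bool
toy_check_naive(const ToyPage *page)
{
    assert(page != NULL);
    return page->nitems >= 1;
}

/*
 * Alternative check: does not look at items on deleted pages at all, so
 * its result does not depend on what deletion left behind.
 */
static bool
toy_check_skip_deleted(const ToyPage *page)
{
    assert(page != NULL);
    if (page->deleted)
        return true;
    return page->nitems >= 1;
}
```

In this model the naive check accepts the primary's deleted page but
rejects the standby's, while the second check treats both the same.
Removing the items on the primary as well would also make the two pages
identical, so no check could distinguish them.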

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
