Re: Concurrency bug in amcheck

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Concurrency bug in amcheck
Date: 2020-04-28 01:04:15
Message-ID: CAH2-Wzk3K6o0EbieaGFHmqfYD-af86gm6yp1KHyL4WdOduMyWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 27, 2020 at 4:17 AM Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > Assuming it doesn't seem we actually need any items on deleted pages,
> > I can propose to delete them on primary as well. That would make
> > contents of primary and standby more consistent. What do you think?
>
> So, my proposal is following. Backpatch my fix upthread to 11. In
> master additionally make _bt_unlink_halfdead_page() remove page items
> on primary as well. Do you have any objections?

What I meant was that we might as well review the behavior of
_bt_unlink_halfdead_page() here, since we have to change it anyway.
But I think you're right: No matter what happens or doesn't happen to
_bt_unlink_halfdead_page(), the fact is that deleted pages may or may
not have a single remaining item (which was originally the "top
parent" item from the page at offset number P_HIKEY), now and forever.
We have to conservatively assume that it could be either state, now
and forever. That means that we definitely have to give up on the
check, per your patch. So, +1 for backpatching that back to 11.

(BTW, I don't think that this is a concurrency issue, except in the
sense that a test case that recreates the false positive is sensitive
to concurrency -- I imagine you agree with this.)

I like your idea of making the primary consistent with the REDO
routine on the master branch only. I wonder if that will make it
possible to change btree_mask() so that wal_consistency_checking can
check deleted pages as well. The contents of a deleted page's special
area do matter, and yet we don't currently verify that it matches (we
use mask_page_content() within btree_mask() for deleted pages, which
seems inappropriately broad). In particular, the left and right
sibling links should be consistent with the primary on a deleted page.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-04-28 02:08:57 Re: WAL usage calculation patch
Previous Message James Coleman 2020-04-28 01:04:09 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays