Re: Improve search for missing parent downlinks in amcheck

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve search for missing parent downlinks in amcheck
Date: 2019-08-18 22:15:19
Message-ID: CAPpHfduTWj00vPXTBfGd=apEiqyO3Hf+PEXgHC=6wDuJ8RBZHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 13, 2019 at 11:44 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > In this revision check for missing downlinks is combined with
> > bt_downlink_check(). So, pages visited by bt_downlink_check() patch
> > doesn't cause extra accessed. It only causes following additional
> > page accesses:
> > 1) Downlinks corresponding to "negative infinity" keys,
> > 2) Pages of child level, which are not referenced by downlinks.
> >
> > But I think these two kinds are very minority, and those accesses
> > could be trade off with more precise missing downlink check without
> > bloom filter. What do you think?
>
> I am generally in favor of making the downlink check absolutely
> reliable, and am not too worried about the modest additional overhead.
> After all, bt_index_parent_check() is supposed to be thorough though
> expensive. The only reason that I used a Bloom filter for
> fingerprinting downlink blocks was that it seemed important to quickly
> get amcheck coverage for subtle multi-level page deletion bugs just
> after v11 feature freeze. We can now come up with a better design for
> that.

Great!

> I was confused about how this patch worked at first. But then I
> remembered that Lehman and Yao consider downlinks to be distinct
> things to separator keys in internal pages. The high key of an
> internal page in the final separator key, so you have n downlinks and
> n + 1 separator keys per internal page -- two distinct things that
> appear in alternating order (the negative infinity item is not
> considered to have any separator key here). So, while internal page
> items are explicitly "(downlink, separator)" pairs that are grouped
> into a single tuple in nbtree, with a separate tuple just for the high
> key, Lehman and Yao would find it just as natural to treat them as
> "(separator, downlink)" pairs. You have to skip the first downlink on
> each internal level to make that work, but this makes our
> bt_downlink_check() check have something from the target page (child's
> parent page) that is like the high key in the child.
>
> It's already really confusing that we don't quite mean the same thing
> as Lehman and Yao when we say downlink (See also my long "why a
> highkey is never truly a copy of another item" comment block within
> bt_target_page_check()), and that is not your patch's fault. But maybe
> we need to fix that to make your patch easier to understand. (i.e.
> maybe we need to go over every use of the word "downlink" in nbtree,
> and make it say something more precise, to make everything less
> confusing.)

I agree that current terms nbtree use to describe downlinks and
separator keys may be confusing. I'll try to fix this and come up
with patch if succeed.

> Other feedback:
>
> * Did you miss the opportunity to verify that every high key matches
> its right sibling page's downlink tuple in parent page? We talked
> about this already, but you don't seem to match the key (you only
> match the downlink block).
>
> * You are decoupling the new check from the bt_index_parent_check()
> "heapallindexed" option. That's probably a good thing, but you need to
> update the sgml docs.
>
> * Didn't you forget to remove the BtreeCheckState.rightsplit flag?
>
> * You've significantly changed the behavior of bt_downlink_check() --
> I would note this in its header comments. This is where ~99% of the
> new work happens.
>
> * I don't like that you use the loaded term "target" here -- anything
> called "target" should be BtreeCheckState.target:
>
> > static void
> > -bt_downlink_missing_check(BtreeCheckState *state)
> > +bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
> > + BlockNumber targetblock, Page target)
> > {
>
> If it's unclear what I mean, this old comment should make it clearer:
>
> /*
> * State associated with verifying a B-Tree index
> *
> * target is the point of reference for a verification operation.
> *
> * Other B-Tree pages may be allocated, but those are always auxiliary (e.g.,
> * they are current target's child pages). Conceptually, problems are only
> * ever found in the current target page (or for a particular heap tuple during
> * heapallindexed verification). Each page found by verification's left/right,
> * top/bottom scan becomes the target exactly once.
> */

The revised patch seems to fix all of above.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
amcheck-btree-improve-missing-parent-downlinks-check-3.patch application/octet-stream 23.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2019-08-18 22:29:53 Re: Support for jsonpath .datetime() method
Previous Message Justin Pryzby 2019-08-18 19:35:33 Re: Zedstore - compressed in-core columnar storage