Re: amcheck verification for GiST

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: amcheck verification for GiST
Date: 2019-02-24 12:37:09
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers


Thanks for this detailed review!

> * Note that this "parent then child" lock order seems to not be
> consistent with the general rule for holding concurrent buffer locks
> that is described in the GiST README:

This is correct. I've changed locking order.
When we check target internal page, we make a palloc'ed copy and unlock it (but hold the pin).
If we find discrepancy between parent and child keys we lock parent page again.
Then look for correct downlink. And check keys again.
In case when discrepancy still present we report an error.
Otherwise drop parent lock.

> * I think that this should be an error:
>> + if (GistPageIsLeaf(page))
>> + {
>> + /* should never happen unless it is root */
>> + Assert(stack->blkno == GIST_ROOT_BLKNO);
>> + }
Done. If this happens, this looks like a programming error or page header flags were corrupted
concurrently. Let's just report.

> * Couldn't leaf-like root pages also get some of the testing that we
> do for other pages within check_index_tuple()? Ideally it wouldn't be
> too much of a special case, though.

> * I think that you need to add an


> * You still hold multiple buffer locks at once, starting with the
> parent and moving to the child. Only 2 buffer locks. Why is this
> necessary, given that you now hold a ShareLock on both heap and index
> relations? Couldn't you just copy the parent into local memory, in the
> style of verify_nbtree.c?

> * The main point of entry for GiST verification is
> gist_check_parent_keys_consistency(). It would be nice to have
> comments that describe what it does, and in what order. I understand
> that English is not your first language, which makes it harder, but I
> would appreciate it if you made the effort to explain the theory. I
> don't want to assume that I understand your intent -- I could be
> wrong.

I've added about 10 lines of comments.

> * Suggest that you use function prototypes consistently, even for
> static functions. That is the style that we prefer. Also, please make
> the comments consistent with project guidelines on indentation and
> style.

> * It would be nice to know if the code in
> gist_check_parent_keys_consistency() finds problems in the parent, or
> in the child. The nbtree check has an idea of a "target" page: every
> page gets to be the target exactly once, and we only ever find
> problems in the current target. Maybe that is arbitrary in some cases,
> because the relationship between the two is where the problem actually
> is. I still think that it is a good idea to make it work that way if
> possible. It makes it easier to describe complicated relationships in
> comments.
the "target" for gist_check_parent_keys_consistency() is tuples on internal
page and their relation with tuples on referenced page. All other checks
are just additional checks, while I expect that this parent-child relationship
may contain some kind of bug: we had observed that GiST could not find some
tuples after CREATE INDEX CONCURRENTLY, but could not reliably reproduce the
problem before it was gone. That's why I've started this work.

> * Why does it make sense to use range_gist_picksplit()/operator class
> "union" support function to verify parent/child relationships? Does it
> handle NULL values correctly, given the special rules?
Yes, union handles NULLs. I do not use range_gist_picksplit().
By definition parent tuple is union of all child tuples.

> * Why not use the "consistent" support function instead of the "union"
> support function? Could you write new code that was based on
> gistindex_keytest() to do this?
Initially, I've used the consistency function. But it answers the question
"Does the downlinked key-space overlap with query". And query may be "not a
given key". Consistency function is controlled by search strategy. Different
operator classes support different set of strategies. But every opclass
should support union to build a GiST. So, I've used "union" instead of

> * The GiST docs ("63.3. Extensibility") say of the "distance" support
> function: "For an internal tree node, the distance returned must not
> be greater than the distance to any of the child nodes". Is there an
> opportunity to test this relationship, too, by making sure that
> distance is sane? Perhaps this is a very naive question -- I am not a
> GiST expert.
If parent tuple is correctly adjusted by child tuples, distance between
them must be 0. With this check We will test opclass, not an index structure.

> * Is it right that gist_check_internal_page() should both return a
> value that says "this internal page has child pages that are
> themselves internal", while testing the child pages?

> * Do we need to worry about F_DELETED pages? Why or why not?
Currently, GiST scan does not check for F_DELETED: there simply is no code
to delete a page in GiST (except my patch on commitfest).
I suspect that there may be deleted pages from previous versions.
But encountering this in a search should not be a problem. Thus, I copy
behavior from index scan: do not complain about deleted pages.

> * Do we need to worry about F_HAS_GARBAGE pages? Why or why not?
This flag is for microvacuum and only hints that there may be some vacuumable
tuples on page. It is used during check for neccesary split to allocate some space.


Best regards, Andrey Borodin.

Attachment Content-Type Size
0001-GiST-verification-function-for-amcheck-v5.patch application/octet-stream 21.6 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Hans Buschmann 2019-02-24 14:04:09 AW: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use Old PostgreSQL community/pgsql-bugs x
Previous Message Alexander Korotkov 2019-02-24 12:34:22 Re: jsonpath