Re: Amcheck verification of GiST and GIN

From: Andrey Borodin <amborodin86(at)gmail(dot)com>
To: Jose Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Amcheck verification of GiST and GIN
Date: 2022-11-27 21:29:18
Message-ID: CAAhFRxh+QgZ=DJ6L6A_mEFtB9s6r8i-nBW9XZepqRQ9v5C-yUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello!

Thank you for the review!

On Thu, Nov 24, 2022 at 6:04 PM Jose Arthur Benetasso Villanova
<jose(dot)arthur(at)gmail(dot)com> wrote:
>
> It compiled with those 2 warnings:
>
> verify_gin.c: In function 'gin_check_parent_keys_consistency':
> verify_gin.c:481:38: warning: declaration of 'maxoff' shadows a previous
> local [-Wshadow=compatible-local]
> 481 | OffsetNumber maxoff =
> PageGetMaxOffsetNumber(page);
> | ^~~~~~
> verify_gin.c:453:41: note: shadowed declaration is here
> 453 | maxoff;
> | ^~~~~~
> verify_gin.c:423:25: warning: unused variable 'heapallindexed'
> [-Wunused-variable]

Fixed.

> 423 | bool heapallindexed = *((bool*)callback_state);
> | ^~~~~~~~~~~~~~
>

This one is in progress yet, heapallindexed check is not implemented yet...

>
> Also, I'm not sure about postgres' headers conventions, inside amcheck.h,
> there is "miscadmin.h" included, and inside verify_gin.c, verify_gist.h
> and verify_nbtree.c both amcheck.h and miscadmin.h are included.
Fixed.

>
> About the documentation, the bt_index_parent_check has comments about the
> ShareLock and "SET client_min_messages = DEBUG1;", and both
> gist_index_parent_check and gin_index_parent_check lack it. verify_gin
> uses DEBUG3, I'm not sure if it is on purpose, but it would be nice to
> document it or put DEBUG1 to be consistent.
GiST and GIN verifications do not take ShareLock for parent checks.
B-tree check cannot verify cross-level invariants between levels when
the index is changing.

GiST verification checks only one invariant that can be verified if
page locks acquired the same way as page split does.
GIN does not require ShareLock because it does not check cross-level invariants.

Reporting progress with DEBUG1 is a good idea, I did not know that
this feature exists. I'll implement something similar in following
versions.

> I did the following test:

Cool! Thank you!

>
> There are more code paths to follow to check the entire code, and I had a
> hard time to corrupt the indices. Is there any automated code to corrupt
> index to test such code?
>

Heapam tests do this in an automated way, look into this file
t/001_verify_heapam.pl.
Surely we can write these tests. At least automate what you have just
done in the review. However, committing similar checks is a very
tedious work: something will inevitably turn buildfarm red as a
watermelon.

I hope I'll post a version with DEBUG1 reporting and heapallindexed soon.
PFA current state.
Thank you for looking into this!

Best regards, Andrey Borodin.

Attachment Content-Type Size
v16-0001-Refactor-amcheck-to-extract-common-locking-routi.patch application/octet-stream 28.3 KB
v16-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch application/octet-stream 32.5 KB
v16-0002-Add-gist_index_parent_check-function-to-verify-G.patch application/octet-stream 26.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-11-27 23:34:34 Re: O(n) tasks cause lengthy startups and checkpoints
Previous Message Justin Pryzby 2022-11-27 20:15:12 Re: ALTER TABLE uses a bistate but not for toast tables