Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>
Subject: Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Date: 2021-12-20 15:37:35
Message-ID: CALT9ZEF=1q-mJiHV6E9e0cQmOYur0vtSX387KLZGs5U3me7vxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> >> I completely agree that checking uniqueness requires looking at the
> heap, but I don't agree that every caller of bt_index_check on an index
> wants that particular check to be performed. There are multiple ways in
> which an index might be corrupt, and Peter wrote the code to only check
> some of them by default, with options to expand the checks to other
> things. This is why heapallindexed is optional. If you don't want to pay
> the price of checking all entries in the heap against the btree, you don't
> have to.
> >>
> >> I've got the idea and revised the patch accordingly. Thanks!
> >> Pfa v4 of a patch. I've added an optional argument to allow uniqueness
> checks for the unique indexes.
> >> Also, I added a test variant to make them work on 32-bit systems.
> Unfortunately, converting the regression test to TAP would be a pain for
> me. Hope it can be used now as a 2-variant regression test for 32 and 64
> bit systems.
> >>
> >> Thank you for your consideration!
> >>
> >> --
> >> Best regards,
> >> Pavel Borisov
> >>
> >> Postgres Professional: http://postgrespro.com
> >> <v4-0001-Make-amcheck-checking-UNIQUE-constraint-for-btree.patch>
> >
> > Looking over v4, here are my review comments...
>

Mark and Peter, big thanks for your ideas!

I had little time to work on this feature until recently, but finally, I've
elaborated v5 patch (PFA)
It contains the following improvements, most of which are based on your
consideration:

- Amcheck tests are reworked into TAP-tests with "break the warranty" by
comparison function changes in the opclass instead of pg_index update.
Mark, again thanks for the sample!
- Added new --checkunique option into pg_amcheck.
- Added documentation and tests for amcheck and pg_amcheck new functions.
- Results are output at ERROR log level like it is done in the other
amcheck tests.
- Rare case of inability to check due to the first entry on a leaf page
being both: (1) equal to the last one on the previous page and (2) deleted
in the heap, is demoted to DEBUG1 log level. In this, I folowed Peter's
consideration that amcheck should do its best to check, but can not always
verify everything. The case is expected to be extremely rare.
- Made pages connectivity based on btpo_next (corrected a bug in the code,
I found meanwhile)
- If snapshot is already taken in heapallindexed case, reuse it for unique
constraint check.

The patch is pgindented and rebased on the current PG master code.
I'd like to re-attach the patch v5 to the upcoming CF if you don't mind.

I also want to add that some customers face index uniqueness
violations more often than I've expected, and I hope this check could also
help some other PostgreSQL customers.

Your further considerations are welcome as always!
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>

Attachment Content-Type Size
v5-0001-Add-option-for-amcheck-and-pg_amcheck-to-check-un.patch application/octet-stream 29.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-12-20 16:25:55 Re: sqlsmith: ERROR: XX000: bogus varno: 2
Previous Message Alvaro Herrera 2021-12-20 15:36:15 Re: psql format output