Re: [HACKERS] A design for amcheck heapam verification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] A design for amcheck heapam verification
Date: 2018-02-05 20:55:20
Message-ID: CAH2-WzmDMgaf-T0adxAHVra2y3w5ubmWPStGL-UPnGx9+TT5CA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 22, 2018 at 6:07 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> Yep. I have provided the feedback I wanted for 0001 (no API change in
> the bloom facility by the way :( ), but I still wanted to look at 0002
> in depths.

I don't see a point in adding complexity that no caller will actually
use. It *might* become useful in the future, in which case it's no
trouble at all to come up with an alternative initialization routine.

Anyway, parallel CREATE INDEX added a new "scan" argument to
IndexBuildHeapScan(), which caused this patch to bitrot. At a minimum,
an additional NULL argument should be passed by amcheck. However, I
have a better idea.

ISTM that verify_nbtree.c should manage the heap scan itself, it the
style of parallel CREATE INDEX. It can acquire its own MVCC snapshot
for bt_index_check() (which pretends to be a CREATE INDEX
CONCURRENTLY). There can be an MVCC snapshot acquired per index
verified, a snapshot that is under the direct control of amcheck. The
snapshot would be acquired at the start of verification on an index
(when "heapallindex = true"), before the verification of the index
structure even begins, and released at the very end of verification.
Advantages of this include:

1. It simplifies the code, and in particular lets us remove the use of
TransactionXmin. Comments already say that this TransactionXmin
business is a way of approximating our own MVCC snapshot acquisition
-- why not *actually do* the MVCC snapshot acquisition, now that
that's possible?

2. It makes the code for bt_index_check() and bt_index_parent_check()
essentially identical, except for varying
IndexBuildHeapScan()/indexInfo parameters to match what CREATE INDEX
itself does. The more we can outsource to IndexBuildHeapScan(), the
better.

3. It ensures that transactions that heapallindexed verify many
indexes in one go are at no real disadvantage to transactions that
heapallindexed verify a single index, since TransactionXmin going
stale won't impact verification (we won't have to skip Bloom filter
probes due to the uncertainty about what should be in the Bloom
filter).

4. It will make parallel verification easier in the future, which is
something that we ought to make happen. Parallel verification would
target a table with multiple indexes, and use a parallel heap scan. It
actually looks like making this work would be fairly easy. We'd only
need to copy code from nbtsort.c, and arrange for parallel workers to
verify an index each ahead of the heap scan. (There would be multiple
Bloom filters in shared memory, all of which parallel workers end up
probing.)

Thoughts?

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-02-05 20:59:48 Re: WIP: BRIN multi-range indexes
Previous Message Andrey Borodin 2018-02-05 20:36:52 Re: [WIP PATCH] Index scan offset optimisation using visibility map