Re: [HACKERS] A design for amcheck heapam verification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] A design for amcheck heapam verification
Date: 2018-02-08 01:56:50
Message-ID: CAH2-WznhneJxZdkbGbqaxeb9i0HRMCiaBAL=xvE6wkCiVpHeGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 5, 2018 at 12:55 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Anyway, parallel CREATE INDEX added a new "scan" argument to
> IndexBuildHeapScan(), which caused this patch to bitrot. At a minimum,
> an additional NULL argument should be passed by amcheck. However, I
> have a better idea.
>
> ISTM that verify_nbtree.c should manage the heap scan itself, it the
> style of parallel CREATE INDEX. It can acquire its own MVCC snapshot
> for bt_index_check() (which pretends to be a CREATE INDEX
> CONCURRENTLY). There can be an MVCC snapshot acquired per index
> verified, a snapshot that is under the direct control of amcheck. The
> snapshot would be acquired at the start of verification on an index
> (when "heapallindex = true"), before the verification of the index
> structure even begins, and released at the very end of verification.

Attached patch fixes the parallel index build bitrot in this way. This
is version 6 of the patch.

This approach resulted in a nice reduction in complexity:
bt_index_check() and bt_index_parent_check() heapallindexed
verification operations both work in exactly the same way now, except
that bt_index_check() imitates a CREATE INDEX CONCURRENTLY (to match
the heavyweight relation locks acquired). This doesn't really need to
be explained as a special case anymore; bt_index_parent_check() is
like an ordinary CREATE INDEX, without any additional "TransactionXmin
heap tuple xmin recheck" complication.

A further benefit is that this makes running bt_index_check() checks
against many indexes more thorough, and easier to reason about. Users
won't have to worry about TransactionXmin becoming very stale when
many indexes are verified within a single command.

I made the following additional, unrelated changes based on various feedback:

* Faster modulo operations.

Andrey Borodin suggested that I make k_hashes() do fewer modulo
operations. While I don't want to change the algorithm to make this
happen, the overhead has been reduced. Modulo operations are now
performed through bitwise AND operations, which is possible only
because the bitset size is always a power of two. Apparently this is a
fairly common optimization for Bloom filters that use (enhanced)
double-hashing; we might as well do it this way.

I've really just transcribed the enhanced double hashing pseudo-code
from the Georgia Tech/Dillinger & Manolios paper into C code, so no
real change there; bloomfilter.c's k_hashes() is still closely based
on "5.2 Enhanced Double Hashing" from that same paper. Experience
suggests that we ought to be very conservative about developing novel
hashing techniques. Paranoid, even.

* New reference to the modulo bias effect.

Michael Paquier wondered why the Bloom filter was always a
power-of-two, which this addresses. (Of course, the "modulo bitwise
AND" optimization I just mentioned is another reason to limit
ourselves to power-of-two bitset sizes, albeit a new one.)

* Removed sdbmhash().

Michael also wanted to know more about sdbmhash(), due to some general
concern about its quality. I realized that it is best to avoid adding
a new general-purpose hash function, whose sole purpose is to be
different to hash_any(), when I could instead use
hash_uint32_extended() to get two 32-bit values all at once. Robert
suggested this approach at one point, actually, but for some reason I
didn't follow up until now.

--
Peter Geoghegan

Attachment Content-Type Size
0001-Add-Bloom-filter-data-structure-implementation.patch text/x-patch 26.4 KB
0002-Add-amcheck-verification-of-indexes-against-heap.patch text/x-patch 34.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-02-08 02:29:34 Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Previous Message Robert Haas 2018-02-08 01:40:41 Re: postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly