Re: A design for amcheck heapam verification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A design for amcheck heapam verification
Date: 2017-10-06 02:00:13
Message-ID: CAH2-WzkwnsPSiYBpa=M4LoZLpvD3zKvL1RLUWoqi_ifyfMMYDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 29, 2017 at 10:54 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> Something that allocates new memory as the patch's bloom_init()
>> function does I'd tend to call 'make' or 'create' or 'new' or
>> something, rather than 'init'.
>
> I tend to agree. I'll adopt that style in the next version. I just
> didn't want the caller to have to manage the memory themselves.

v3 of the patch series, attached, does it that way -- it adds a
bloom_create(). The new bloom_create() function still allocates its
own memory, but does so while using a FLEXIBLE_ARRAY_MEMBER. A
separate bloom_init() function (that works with dynamic shared memory)
could easily be added later, for the benefit of parallel hash join.

Other notable changes:

* We now support bloom filters that have bitsets of up to 512MB in
size. The previous limit was 128MB.

* We now use TransactionXmin for our AccessShareLock xmin cutoff,
rather than calling GetOldestXmin(). This is the same cut-off used by
xacts that must avoid broken hot chains for their earliest snapshot.
This avoids a scan of the proc array, and allows more thorough
verification, since GetOldestXmin() was overly restrictive here.

* Expanded code comments describing the kinds of problems the new
verification capability is expected to be good at catching.

For example, there is now a passing reference to the CREATE INDEX
CONCURRENTLY bug that was fixed back in February (it's given as an
example of a more general problem -- faulty HOT safety assessment).
With the new heapallindexed enhancement added by this patch, amcheck
can be expected to catch that issue much of the time. We also go into
heap-only tuple handling within IndexBuildHeapScan(). The way that
CREATE INDEX tries to index the most recent tuple in a HOT chain
(while locating the root tuple in the chain, to get the right heap TID
for the index) has proven to be very useful as a smoke test while
investigating HOT/VACUUM FREEZE bugs in the past couple of weeks [1].
I believe it would have caught several historic MultiXact/recovery
bugs, too. This all seems worth noting explicitly.

[1] https://postgr.es/m/CAH2-Wznm4rCrhFAiwKPWTpEw2bXDtgROZK7jWWGucXeH3D1fmA@mail.gmail.com
--
Peter Geoghegan

Attachment Content-Type Size
0002-Add-amcheck-verification-of-indexes-against-heap.patch text/x-patch 36.0 KB
0001-Add-Bloom-filter-data-structure-implementation.patch text/x-patch 12.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-10-06 02:20:05 valgrind complains about WaitEventSetWaitBlock on HEAD (fe9ba28e)
Previous Message Amit Langote 2017-10-06 01:59:41 Re: v10 bottom-listed