Re: A design for amcheck heapam verification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A design for amcheck heapam verification
Date: 2017-05-01 23:28:59
Message-ID: CAH2-Wz=JTEtU4n26LyHZGgW_X1u+KM2vgnXGTj28sKPW3WAfUw@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 1, 2017 at 2:10 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Actually, I guess amcheck would need to use its own scan's snapshot
> xmin instead. This is true because it cares about visibility in a way
> that's "backwards" relative to existing code that tests something
> against RecentGlobalXmin. Is there any existing thing that works that
> way?

Looks like pg_visibility has a similar set of concerns, and so
sometimes calls GetOldestXmin() to "recompute" what it calls
OldestXmin (which I gather is like RecentGlobalXmin, but comes from
calling GetOldestXmin() at least once). This happens within
pg_visibility's collect_corrupt_items(). So, I could either follow
that approach, or, more conservatively, call GetOldestXmin()
immediately after each "amcheck whole index scan" finishes, for use
later on, when we go to the heap. Within the heap, we expect that any
committed tuple whose xmin precedes FooIndex.OldestXmin should be
present in that index's Bloom filter. Of course, when there are
multiple indexes, we might only arrive at the heap much later. (I
guess we'd also want to check whether the MVCC snapshot's xmin
precedes FooIndex.OldestXmin, and use it as FooIndex.OldestXmin when
that is the case.)
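
To make the proposed rule concrete, here is a rough, standalone sketch of it. It is not amcheck code and does not call into PostgreSQL: IndexCheckState, index_scan_add(), index_scan_finish() and heap_tuple_ok() are hypothetical names, the "horizon_now" argument stands in for whatever GetOldestXmin() would return once the index scan finishes, and a 64-bit mask keyed on "key % 64" is a toy stand-in for a real Bloom filter. The XID comparison only imitates the modulo-2^32 style of TransactionIdPrecedes() and ignores special XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wraparound-aware "a is older than b"; special XIDs are ignored here. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical per-index verification state ("FooIndex" in the text). */
typedef struct IndexCheckState
{
    TransactionId oldest_xmin;  /* horizon captured when the index scan ended */
    uint64_t      filter_bits;  /* toy stand-in for the index's Bloom filter */
} IndexCheckState;

/* Toy single-hash fingerprint: one bit out of 64. */
static uint64_t
toy_bit(uint64_t key)
{
    return UINT64_C(1) << (key % 64);
}

/* Called for every index tuple seen during the whole-index scan. */
static void
index_scan_add(IndexCheckState *state, uint64_t key)
{
    assert(state != NULL);
    state->filter_bits |= toy_bit(key);
}

/*
 * Called once, right after the whole-index scan finishes.  horizon_now is
 * assumed to be the value GetOldestXmin() would report at that moment.  If
 * the scan's MVCC snapshot xmin is older, use it instead.
 */
static void
index_scan_finish(IndexCheckState *state, TransactionId horizon_now,
                  TransactionId snapshot_xmin)
{
    assert(state != NULL);
    state->oldest_xmin = horizon_now;
    if (xid_precedes(snapshot_xmin, horizon_now))
        state->oldest_xmin = snapshot_xmin;
}

/*
 * Heap-side check: a committed tuple whose xmin precedes the index's saved
 * horizon is expected to be in the filter.  Anything else is not required
 * to be there, so it passes trivially.
 */
static bool
heap_tuple_ok(const IndexCheckState *state, uint64_t key,
              TransactionId tuple_xmin, bool xmin_committed)
{
    assert(state != NULL);
    if (!xmin_committed || !xid_precedes(tuple_xmin, state->oldest_xmin))
        return true;
    return (state->filter_bits & toy_bit(key)) != 0;
}
```

With several indexes, each one would carry its own IndexCheckState, so the horizon saved for an index stays tied to the moment that index's scan ended, however much later the heap pass runs. As with any Bloom filter, a hit only means "probably present"; only a miss is evidence of a problem.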

Anyone have an opinion on any of this? Offhand, I think that calling
GetOldestXmin() once per index when its "amcheck whole index scan"
finishes would be safe, and yet provide appreciably better test
coverage than only expecting things visible to our original MVCC
snapshot to be present in the index. I don't see a great reason to be
more aggressive and call GetOldestXmin() more often than once per
whole index scan, though.

--
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/
