Re: [HACKERS] A design for amcheck heapam verification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] A design for amcheck heapam verification
Date: 2018-03-29 00:49:10
Message-ID: CAH2-Wz=+Arq6subo=sMDsx8zWefzqfMn6CWgpNdyWyoxPJWikw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 28, 2018 at 5:04 AM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> Hmm Ok. It still sounds backwards to me, but then English is not my first or
> even second language. "A heap scan later verifies the presence in the heap
> of all index tuples fingerprinted" sounds as if we expect to find all
> fingerprinted index tuples in the heap. But in reality, we check if all heap
> tuples have a fingerprinted index tuple.

Why don't we leave this to the committer that picks the patch up? I
don't actually mind very much. I suspect that I am too close to the
problem to be sure that I've explained it in the clearest possible
way.
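For what it's worth, the direction of the check can be sketched in miniature as follows, using a plain Python set in place of amcheck's Bloom filter and made-up function names (this is an illustration of the idea, not the patch's actual C code): pass one fingerprints every index tuple, and pass two scans the heap, verifying that each heap tuple's expected index tuple was fingerprinted — which matches Pavan's reading.

```python
# Two-pass heap-vs-index verification, sketched with a set standing
# in for the Bloom filter. All names here are illustrative.

def fingerprint_index(index_tuples):
    """Pass 1: fingerprint every index tuple (key + heap TID)."""
    return {(key, tid) for key, tid in index_tuples}

def verify_heap_against_index(heap_tuples, fingerprints):
    """Pass 2: every heap tuple must have had a matching index
    tuple fingerprinted; report any that were not."""
    missing = []
    for key, tid in heap_tuples:
        if (key, tid) not in fingerprints:
            missing.append((key, tid))
    return missing

index = [("a", 1), ("b", 2), ("c", 3)]
heap = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]  # "d" lacks an index tuple

fp = fingerprint_index(index)
print(verify_heap_against_index(heap, fp))  # -> [('d', 4)]
```

With a real Bloom filter in place of the set, a membership probe can return a false positive, so a missing index tuple may occasionally go undetected — but corruption is never falsely reported.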

>> You're right - there is a narrow window for REPEATABLE READ and
>> SERIALIZABLE transactions. This is a regression in v6, the version
>> that removed the TransactionXmin test.
>>
>> I am tempted to fix this by calling GetLatestSnapshot() instead of
>> GetTransactionSnapshot(). However, that has a problem of its own -- it
>> won't work in parallel mode, and we're dealing with parallel
>> restricted functions, not parallel unsafe functions. I don't want to
>> change that to fix such a narrow issue. IMV, a better fix is to treat
>> this as a serialization failure. Attached patch, which applies on top
>> of v7, shows what I mean.
>
>
> Ok. I am happy with that.

Cool.
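From the caller's perspective, treating the narrow race as a serialization failure means clients handle it exactly like any other retryable serialization error. A minimal sketch of that client-side pattern (SerializationFailure and run_check are illustrative stand-ins, not PostgreSQL's actual C-level API):

```python
# Client-side view of the proposed fix: the stale-snapshot race is
# reported as a serialization failure, so callers simply retry.

class SerializationFailure(Exception):
    """Stands in for SQLSTATE 40001 (serialization_failure)."""

def run_with_retry(run_check, max_attempts=3):
    """Re-run the verification if it hits the narrow race."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_check()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
```

The design choice here is that a rare, transient condition is surfaced through an error path applications must already handle at these isolation levels, rather than by weakening the parallel-restricted property of the functions.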

> Well, the pg_index entry will be visible, and should be visible. Otherwise
> how do you even maintain a newly created index? I am not sure, but I guess
> we take fresh MVCC snapshots for catalog searches, even in RR transactions.

True, but that isn't what would happen with an SQL query that queries
the system catalogs. That only applies to how the system catalogs are
accessed internally, not how they'd almost certainly be accessed when
using amcheck.

>> > Are there any interesting cases around
>> > INSERT_IN_PROGRESS/DELETE_IN_PROGRESS
>> > tuples, especially if those tuples were inserted/deleted by our own
>> > transaction? It is probably worth thinking about.
>
>
> Anything here that you would like to check? I understand that you may see
> such tuples only for catalog tables.

I think that the WARNING ought to be fine. We shouldn't ever see it,
but if we do it's probably due to a bug in an extension or something.
It doesn't seem particularly likely that someone could ever insert
into the table concurrently despite our having a ShareLock on the
relation. Even with corruption.
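The handling being discussed can be sketched like this: with a ShareLock held on the relation, concurrently inserted or deleted tuples should never be seen, so those states draw a WARNING and are skipped rather than treated as corruption. The state names mirror HeapTupleSatisfiesVacuum's return codes; the dispatch shown is a simplification, not amcheck's exact logic.

```python
# Simplified visibility dispatch for the heap scan. IN_PROGRESS
# states "can't happen" under ShareLock, so they warn and skip.
import warnings

HEAPTUPLE_LIVE = "LIVE"
HEAPTUPLE_DEAD = "DEAD"
HEAPTUPLE_INSERT_IN_PROGRESS = "INSERT_IN_PROGRESS"
HEAPTUPLE_DELETE_IN_PROGRESS = "DELETE_IN_PROGRESS"

def action_for_tuple(state):
    if state in (HEAPTUPLE_INSERT_IN_PROGRESS,
                 HEAPTUPLE_DELETE_IN_PROGRESS):
        warnings.warn("concurrent insert/delete despite ShareLock; "
                      "possibly a bug in an extension")
        return "skip"
    if state == HEAPTUPLE_LIVE:
        return "check"  # probe the fingerprints for this tuple
    return "skip"       # dead tuples are simply not checked
```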

>> IndexBuildHeapRangeScan() doesn't mention anything about CIC's heap
>> ShareUpdateExclusiveLock (it just says SharedLock), because that lock
>> strength doesn't have anything to do with IndexBuildHeapRangeScan()
>> when it operates with an MVCC snapshot. I think that this means that
>> this patch doesn't need to update comments within
>> IndexBuildHeapRangeScan(). Perhaps that's a good idea, but it seems
>> independent.
>
>
> Ok, I agree. But note that we are now invoking that code with
> AccessShareLock, whereas the existing callers use either ShareLock or
> ShareUpdateExclusiveLock. That probably does not matter, but it's a change
> worth noting.

Fair point, even though the ShareUpdateExclusiveLock case isn't
actually acknowledged by IndexBuildHeapRangeScan(). Can we leave this
one up to the committer, too? I find it very hard to argue either for
or against this, and I want to avoid "analysis paralysis" at this
important time.

> While that's true, I am wondering if there is anything we can do to stop a
> consistency-checking utility from creating other non-trivial side effects on
> the state of the database, even if that means we cannot check all heap
> tuples. For example, can there be a way by which we allow concurrent vacuum
> or HOT prune to continue to prune away dead tuples, even if that means
> running bt_check_index() in some specialised way (such as not allowing it in
> a transaction block) and the heap scan might miss some tuples? I don't know
> the answer to that, but given that bt_check_index() may sometimes take hours
> if not days to finish, it seems worth thinking about, or at least
> documenting the side-effects somewhere.

That seems like a worthwhile goal for a heap checker that only checks
the structure of the heap, rather than something that checks the
consistency of an index against its table. Especially one that can run
without a connection to the database, for things like backup tools,
where performance is really important. There is certainly room for
that. For this particular enhancement, the similarity to CREATE INDEX
seems like an asset.

--
Peter Geoghegan
