|From:||Andrey Borodin <x4mmm(at)yandex-team(dot)ru>|
|To:||Peter Geoghegan <pg(at)bowt(dot)ie>|
|Cc:||Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: amcheck verification for GiST|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Sorry for the delay. Here's new version.
> 1 февр. 2019 г., в 4:58, Peter Geoghegan <pg(at)bowt(dot)ie> написал(а):
> On Tue, Jan 1, 2019 at 2:11 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>>> This isn't consistent with what you do within verify_nbtree.c, which
>>> deliberately avoids ever holding more than a single buffer lock at a
>>> time, on general principle. That isn't necessarily a reason why you
>>> have to do the same, but it's not clear why you do things that way.
>>> Why isn't it enough to have a ShareLock on the relation? Maybe this is
>>> a sign that it would be a good idea to always operate on a palloc()'d
>>> copy of the page, by introducing something equivalent to
>>> palloc_btree_page(). (That would also be an opportunity to do very
>>> basic checks on every page.)
>> If we unlock parent page before checking child, some insert can adjust tuple on parent, sneak into child and insert new tuple.
>> This can trigger false positive. I'll think about it more.
> I think that holding a buffer lock on an internal pages for as long as
> it takes to check all of the child pages is a non-starter. If you
> can't think of a way of not doing that that's race free with a
> relation-level AccessShareLock, then a relation-level ShareLock (which
> will block VACUUM) seems necessary.
> I think that you should duplicate some of what's in
> bt_index_check_internal() -- lock the heap table as well as the index,
> to avoid deadlocks. We might not do this with contrib/pageinspect, but
> that's not really intended to be used in production in the same way
> this will be.
I've extracted functions amcheck_lock_relation() and amcheck_unlock_relation().
With this patch a little bit complicated, but I do not think that code duplication will be OK here.
Instead of lock\unlock functions I can provide one-function-interface receiving index_checkable() and index_check() callbacks.
>>> * Maybe gist_index_check() should be called gist_index_parent_check(),
>>> since it's rather like the existing verification function
>>> * Alternatively, you could find a way to make your function only need
>>> an AccessShareLock -- that would make gist_index_check() an
>>> appropriate name. That would probably require careful thought about
>> I've changed lock level to AccessShareLock. IMV scan is just as safe as regular GiST index scan.
>> There is my patch with VACUUM on CF, it can add deleted pages. I'll update one of these two patches accordingly, if other is committed.
> Maybe you should have renamed it to gist_index_parent_check() instead,
> and kept the ShareLock. I don't think that this business with buffer
> locks is acceptable. And it is mostly checking parent/child
> relationships. Right?
That's right. Semantically gist_index_parent_check() is correct name, let's use it.
> You're not really able to check GiST tuples against anything other
> than their parent, unlike with B-Tree (you can only do very simple
> things, like the IndexTupleSize()/lp_len cross-check). Naming the
> function gist_index_parent_check() seems like the right way to go,
> even if you could get away with an AccessShareLock (which I now tend
> to doubt). It was way too optimistic to suppose that there might be a
> clever way of locking down race conditions that allowed you to not
> couple buffer locks, but also use an AccessShareLock. After all, even
> the B-Tree amcheck code doesn't manage to do this when verifying
> parent/child relationships (it only does something clever with the
> sibling page cross-check).
That's true, we cannot avoid locking parent and child page simultaneously to check correctness of tuples.
Currently, we do not check index tuples against heap. Should we do this or leave this for another patch?
Best regards, Andrey Borodin.
|Next Message||Andrey Borodin||2019-02-17 10:33:12||Re: [WIP] CREATE SUBSCRIPTION with FOR TABLES clause (table filter)|
|Previous Message||Fabien COELHO||2019-02-17 08:51:59||Re: CPU costs of random_zipfian in pgbench|