Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Dmitry Ivanov <d(dot)ivanov(at)postgrespro(dot)ru>
Cc: Shubham Barai <shubhambaraiss(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Borodin <amborodin86(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)
Date: 2018-04-09 10:33:23
Message-ID: 0b3ad2c2-2692-62a9-3a04-5724f2af9114@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-www

On 28/03/18 19:53, Teodor Sigaev wrote:
> Hi!
>
> I slightly modified test for clean demonstration of difference between
> fastupdate on and off. Also I added CheckForSerializableConflictIn() to
> fastupdate off codepath, but only in case of non-empty pending list.
>
> The next question what I see: why do not we lock entry leaf pages in some cases?

Why should we?

> As I understand, scan should lock any visited page, but now it's true only for
> posting tree. Seems, it also should lock pages in entry tree because concurrent
> procesess could add new entries which could be matched by partial search, for
> example. BTW, partial search (see collectMatchBitmap()) locks correctly entry
> tree, but regular startScanEntry() doesn't lock entry page in case of posting
> tree, only in case of posting list.
I think this needs some high-level comments or README to explain how the
locking works. It seems pretty ad hoc at the moment. And incorrect.

1. Why do we lock all posting tree pages, even though they all represent
the same value? Isn't it enough to lock the root of the posting tree?

2. Why do we lock any posting tree pages at all, if we lock the entry
tree page anyway? Isn't the lock on the entry tree page sufficient to
cover the key value?

3. Why do we *not* lock the entry leaf page, if there is no match? We
still need a lock to remember that we probed for that value and there
was no match, so that we conflict with a tuple that might be inserted later.

At least #3 is a bug. The attached patch adds an isolation test that
demonstrates it. #1 and #2 are weird, and cause unnecessary locking, so
I think we should fix those too, even if they don't lead to incorrect
results.

Remember, the purpose of predicate locks is to lock key ranges, not
physical pages or tuples in the index. We use leaf pages as handy
shortcut for "any key value that would belong on this page", but it is
just an implementation detail.

I took a stab at fixing those issues, as well as the bug when fastupdate
is turned on concurrently. Does the attached patch look sane to you?

- Heikki

Attachment Content-Type Size
0001-Re-think-predicate-locking-on-GIN-indexes.patch text/x-patch 34.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2018-04-09 10:37:33 Re: Optimizing nested ConvertRowtypeExpr execution
Previous Message Kyotaro HORIGUCHI 2018-04-09 10:26:27 Fixing a trivial typo in comment in rewriteManip.c

Browse pgsql-www by date

  From Date Subject
Next Message Teodor Sigaev 2018-04-09 11:35:05 Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)
Previous Message Liaqat Andrabi 2018-04-09 08:56:42 Edit access to PostgreSQL Ecosystem Wiki Page