Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock(PG10.7)

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: chenhj <chjischj(at)163(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock(PG10.7)
Date: 2019-11-17 18:18:43
Message-ID: CAPpHfdvtwJbPSRWMfhXfvd4ZMb7GCxc77Su=u9WdTobWZEpvgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 11, 2019 at 2:42 AM Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> I'm sorry for late reply. I was busy with various things. Also
> digging into these details took some time. Please find my explanation
> below.
>
> On Wed, Oct 30, 2019 at 2:34 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > In general, it seems very important to be clear about exactly how the
> > key space works. The nbtree work for v12 greatly benefitted from
> > defining comparisons in a way that didn't really change how nbtree
> > worked, while at the same time minimizing I/O and making nbtree
> > faithful to Lehman & Yao's original design. It isn't obvious how
> > valuable it is to really carefully define how invariants and key
> > comparisons work, but it seems possible to solve a lot of problems
> > that way.
>
> gin packs downlinks and pivot key into tuples in the different way
> than nbtree does. Let's see the logical structure of internal B-tree
> page. We can represent it as following.
>
> downlink_1, key_1_2, downlink_2, key_2_3, .... , downlink_n, highkey
>
> key_{i-1}_i is pivot key, which splits key space between
> downlink_{i-1} and downlink_i. So, every key under downlink_{i-1} is
> <= key_{i-1}_i. Every key under downlink_i is > key_{i-1}_i. And all
> underlying keys are <= highkey.
>
> nbtree packs that into tuples as followings (tuples are shown in parentheses).
>
> (highkey), (-inf, downlink_1), (key_1_2, downlink_2), ...
> (key_{n-1}_n, downlink_n)
>
> So, we store highkey separately. key_{i-1}_i and downlink_i form a
> tuple together. downlink_1 is coupled with -inf key.
>
> gin packs tuples in a different way.
>
> (downlink_1, key_1_2), (downlink_2, key_2_3), ... , (downlink_n, highkey)
>
> So, in GIN key_{i-1}_i and downlink_{i-1} form a tuple. Highkey is
> coupled with downlink_n. And -inf key is not needed here.
>
> But there is couple notes about highkey:
> 1) In entry tree rightmost page, a key coupled with downlink_n doesn't
> really matter. Highkey is assumed to be infinity.
> 2) In posting tree, a key coupled with downlink_n always doesn't
> matter. Highkey for non-rightmost pages is stored separately and
> accessed via GinDataPageGetRightBound().
>
> I think this explains following:
> 1) The invariants in gin btree are same as they are in nbtree. Just
> page layout is different.
> 2) The way entryLocateEntry() works. In particular why it's OK to go
> the mid downlink if corresponding key equals.
> 3) There is no "negative infinity" item in internal pages.
>
> Does the explanation of above looks clear for you? If so, I can put
> it into the README.

So, I've put this explanation into README patch. I just change
designation to better match with Lehman & Yao paper and did some minor
enchantments.

I'm going to push this patchset if no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0002-Fix-traversing-to-the-deleted-GIN-page-via-downlin-4.patch application/octet-stream 3.1 KB
0001-Fix-deadlock-between-ginDeletePage-and-ginStepRigh-4.patch application/octet-stream 11.3 KB
0003-Revise-GIN-README-4.patch application/octet-stream 12.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2019-11-17 18:24:08 Reverse collations (initially for making keyset pagination cover more cases)
Previous Message Magnus Hagander 2019-11-17 13:38:21 Re: global / super barriers (for checksums)