Re: Why does L&Y Blink Tree need lock coupling?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Oliver Yang <olilent2ctw(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Why does L&Y Blink Tree need lock coupling?
Date: 2022-12-12 20:13:02
Message-ID: CAH2-Wz=OA1EcjLxjoDUpgVLvDg-A=WZ_zMQ+jzqwWsRQExO_6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 12, 2022 at 11:43 AM Oliver Yang <olilent2ctw(at)gmail(dot)com> wrote:
> In a sense, this isn't a fundamental issue and L&Y
> paper could be easily tweaked to track downlink so that it
> doesn't require coupling lock in moveright.

I suppose that's true, but having 3 locks at a time is such a rare
case for classic L&Y that it hardly matters (it's far rarer than
having only 2, which is itself very rare). They emphasized it because
it is the highest number of locks, not because it's necessarily
important.

I'm confused about whether you're talking about what L&Y could do, in
theory, or what nbtree could do, in theory, or what nbtree should
actually be doing. These are 3 different subjects, really.

> However, it's hard to see why coupling lock is needed during
> ascent from child level to parent level in the L&Y setting. What can
> go wrong if L&Y's algorithm releases lock on child page before
> acquiring lock on its parent? The correctness proof in L&Y
> doesn't use the assumption of coupling lock anywhere. It appears
> that a lock at a time is sufficient in principle.

That may be true, but the words "in principle" are doing a lot of work
for you here. Travelling at a speed that approaches the speed of light
is also possible in principle.

Imagine that there is no lock coupling at all. What happens when there
is a page split that doesn't complete its second phase due to a hard
crash? Do inserters complete the split in passing, like in nbtree? Now
you have to think about incomplete splits across multiple levels. That
gets very complicated, but it needs to be addressed in a real system
like Postgres. Academic papers can just ignore corner cases like this.

Why do you think that consistently only holding one lock (as opposed
to only holding one lock at a time during 99%+ of all inserts) is
truly valuable in a practical setting? Maybe it is valuable, but
that's rather unclear. It is not an area that I would choose to work
on, given that uncertainty.

> The direct quote from section 1.2 of Lanin & Shasha
> paper: "Although it is not apparent in itself, the B-link
> structure allows inserts and searches to lock only one node at a
> time." It seems to be an assertion on the property of the L&Y
> algorithm. It doesn't seem to be related the optimistic approach
> employed in Lanin & Shasha own algorithm.

I don't know how you can reach that conclusion. It directly
contradicts the claim made by the L&Y paper about requiring at most 3
locks. And they even say "although it's not apparent in itself",
presenting it as new information.

They seem to be saying that the same basic B-Link data structure (or
one like it, with the addition of outlinks) could do that -- but
that's not the same as the L&Y algorithm (the original design). That
does seem possible, though I doubt that it would be particularly
compelling, since L&Y/nbtree don't need to do lock coupling for the
vast majority of individual inserts or searches.

I don't think that the L&Y paper is particularly clear, or
particularly well written. It needs to be interpreted in its original
context, which is quite far removed from the current concerns of
nbtree. It's a 41 year old paper.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-12-12 21:01:36 Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX
Previous Message Nathan Bossart 2022-12-12 20:04:27 Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX