Re: Cleaning up nbtree after logical decoding on standby work

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Cleaning up nbtree after logical decoding on standby work
Date: 2023-05-26 17:28:58
Message-ID: CAH2-WzkBsXyp4Bo5aCTamXoYxULBkDGvpmqsi0uo+kcT_HQaFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 26, 2023 at 1:56 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> I suppose you're not thinking of applying this to current master, but
> instead just leave it for when pg17 opens, right? I mean, clearly it
> seems far too invasive to put it in after beta1.

I was planning on targeting 16 with this. Although I only posted a
patch recently, I complained about the problems in this area shortly
after the code first went in. It's fairly obvious to me that the
changes made to nbtree went quite a bit further than they needed to.
Admittedly that's partly because I'm an expert on this particular
code.

> On the other hand,
> it's painful to know that we're going to have code that exists only on
> 16 and not any other release, in an area that's likely to have bugs here
> and there, so we're going to need to heavily adjust backpatches for 16
> especially.

Right -- it's important to keep things reasonably consistent to ease
backpatching. Though I doubt that changes to nbtree itself will turn
out to be buggy -- with or without my patch. The changes to nbtree
were all pretty mechanical. A little too mechanical, in fact.

As I said already, there just aren't that many ways that new nbtree
pages can come into existence -- it's naturally limited to page splits
(including root page splits), and the case where we need to add a new
root page that's also a leaf page at the point that the first ever
tuple is inserted into the index (before that we just have a metapage)
-- so I only have three _bt_allocbuf() callers to worry about. It's
completely self-evident (even to people that know little about nbtree)
that the only type of page access that could possibly need a heapRel
in the first place is P_NEW (i.e., a new page allocation). Once you
know all that, this situation begins to look much more
straightforward.

Now, to be fair to Bertrand, it *looks* more complicated than it
really is, in large part due to the obscure case where VACUUM finishes
an interrupted page split (during page deletion), which itself goes on
to cause a page split one level up. So it's possible (barely) that
VACUUM will enlarge an index by one page, which requires a heapRel,
just like any other place where an index is enlarged by a page split
(I documented all this in commit 35bc0ec7).

I've added several defensive assertions that make it hard to get the
details wrong. These will catch the issue much earlier than the main
"heapRel != NULL" assertion in _bt_allocbuf(). So, the rules are
reasonably straightforward and enforceable.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2023-05-26 17:43:09 Re: pg_collation.collversion for C.UTF-8
Previous Message vignesh C 2023-05-26 17:00:12 Re: Implement generalized sub routine find_in_log for tap test