Quick Links

Re: pgsql: Optimize btree insertions for common case of increasing values

From:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Optimize btree insertions for common case of increasing values
Date:	2018-04-05 13:27:40
Message-ID:	CABOikdNoffgt3XkSKPipawi1QYXxEgs67ZV9OsTVEb6jpQN+iw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-committers

On Wed, Apr 4, 2018 at 10:46 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

>
>
> > You mean passing "fastpath" to _bt_insertonpg and then checking it's
> false
> > if page split is needed? But isn't page split is only needed if the page
> > doesn't have enough free space? If so, we have checked for that before
> > setting "fastpath".
>
> That's not exactly what I meant. I meant that if:
>
> 1. This is an insertion to the leaf level in _bt_insertonpg().
>
> and
>
> 2. We don't have space on the page, and so must do a split (or at
> least free LP_DEAD items).
>
> and
>
> 3. RelationGetTargetBlock(rel) != InvalidBlockNumber
>
> There should be an assertion failure. This new assertion within
> _bt_insertonpg() makes it much less likely that the assumption breaks
> later.
>

I came up with something like this:

+ /*
+ * If we're here then a pagesplit is needed. We should
never reach here
+ * if we're using the fastpath since we should have checked
for all the
+ * required conditions, including the fact that this page
has enough
+ * freespace. Note that this routine can in-theory deal
with the
+ * situation where a NULL stack pointer is passed (that's
what would
+ * happen if the fastpath is taken), like it does during
crash
+ * recovery. But that path is much slower, defeating the
very purpose
+ * of the optimisation. The following assertion should
protect us from
+ * any future code changes that invalidate those
assumptions.
+ *
+ * Note the fact that whenever we fail to take the
fastpath, we clear
+ * the cached block. So checking for a valid cached block
at this point
+ * is enough to decide whether we're in a fastpath or not.
+ */
+ Assert(!(P_ISLEAF(lpageop) &&
+
BlockNumberIsValid(RelationGetTargetBlock(rel))));
+

But then I started thinking can _bt_insertonpg() be called from a code path
which doesn't start at _bt_doinsert()? I traced one such path
_bt_getstackbuf() -> _bt_finish_split() -> _bt_insert_parent(), but that
can only happen in case of a non-leaf page. The assertion which checks for
a leaf page, should be fine, right?

>
> Finally, I'd like to see a small paragraph in the nbtree README, about
> the high level theory behind this optimization and page recycling. The
> assumption is that there can only be one non-ignorable leaf rightmost
> page, and so even a RecentGlobalXmin style interlock isn't required.
> We cannot fail to detect that our hint was invalidated, because there
> can only be one such page in the B-Tree at any time. It's possible
> that the page will be deleted and recycled without a backend's cached
> page also being detected as invalidated, but only when we happen to
> recycle a page that once again becomes the rightmost leaf page.
>
>
Can I borrow that almost verbatim, after adding details about the
optimisation itself? Looks like I'm too tired to think sensibly.

> Once those changes are made, this should be fine to commit.
>

Ok, thanks.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: pgsql: Optimize btree insertions for common case of increasing values at 2018-04-04 17:16:50 from Peter Geoghegan

Responses

Re: pgsql: Optimize btree insertions for common case of increasing values at 2018-04-05 17:16:52 from Peter Geoghegan

Browse pgsql-committers by date

	From	Date	Subject
Next Message	Teodor Sigaev	2018-04-05 14:56:18	pgsql: Fix handling of non-upgraded B-tree metapages
Previous Message	Alvaro Herrera	2018-04-05 13:17:48	Re: pgsql: Validate page level checksums in base backups