Re: Faster inserts with mostly-monotonically increasing values

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(dot)riggs(at)2ndquadrant(dot)com>
Subject: Re: Faster inserts with mostly-monotonically increasing values
Date: 2017-12-31 11:06:21
Message-ID: CAH2-Wz=X8_FxPO0wkhcEdmTm5x7Ksqx6ix756Nu6KYCAbR7_VQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Dec 31, 2017 at 6:44 AM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> Here is a patch that implements the idea. If the last insert happens to be
> in the rightmost block of an index, then we cache the block and check that
> first for the next insert. For the cached block to be usable for the insert,
> it should still be the rightmost, the to-be-inserted key should go into the
> cached block and there is enough free space in the block. If any of these
> conditions do not meet, we fall back to the regular code path, i.e. descent
> down the index from the top.

I'd be particularly concerned about page deletion/page recycling still
being correct with the patch, especially because it's hard to test
anything there. The P_ISLEAF() part of your fastpath's test hints at
this -- why should it not *always* be a leaf page (surely you should
be sure that the page cannot be recycled anyway)? I also have my
doubts about unique index enforcement remaining correct with the patch
when there are many physical duplicates, to the extent that more than
a single leaf page is needed for a single value.

Maybe you should do something with page LSN here -- cache LSN
alongside block number, and have a non-matching LSN invalidate the
test.

How many clients are involved with your benchmark?

> So as the size of the index increases, the benefits of the patch also tend
> to increase. This is most likely because as the index becomes taller and
> taller, the costs associated with index descent becomes higher.

FWIW, I think that int4 indexes have to be extremely large before they
exceed 4 levels (where the leaf level counts as a level). My
conservative back-of-an-envelope estimate is 9 billion index tuples.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2017-12-31 11:18:31 Re: [PATCH] GET DIAGNOSTICS FUNCTION_NAME
Previous Message Magnus Hagander 2017-12-31 10:58:22 Re: Logical decoding fast-forward and slot advance