Re: Faster inserts with mostly-monotonically increasing values

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(dot)riggs(at)2ndquadrant(dot)com>
Subject: Re: Faster inserts with mostly-monotonically increasing values
Date: 2018-01-02 12:51:51
Message-ID: CABOikdMx91+Lou8Z5MQEg=44PT-jybMXaoypXyR=QhJGKbYvDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Dec 31, 2017 at 4:36 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Sun, Dec 31, 2017 at 6:44 AM, Pavan Deolasee
> <pavan(dot)deolasee(at)gmail(dot)com> wrote:
> > Here is a patch that implements the idea. If the last insert happens to
> be
> > in the rightmost block of an index, then we cache the block and check
> that
> > first for the next insert. For the cached block to be usable for the
> insert,
> > it should still be the rightmost, the to-be-inserted key should go into
> the
> > cached block and there is enough free space in the block. If any of these
> > conditions do not meet, we fall back to the regular code path, i.e.
> descent
> > down the index from the top.
>
> I'd be particularly concerned about page deletion/page recycling still
> being correct with the patch, especially because it's hard to test
> anything there. The P_ISLEAF() part of your fastpath's test hints at
> this -- why should it not *always* be a leaf page (surely you should
> be sure that the page cannot be recycled anyway)?

I thought we can throw in a bunch of checks there to ensure that we are
still looking at the correct rightmost page or else fallback to the regular
path. Do you see something missing there? I don't claim that I've got
everything correct, but since we're holding an exclusive lock on the page
at that point, wouldn't we be able to guard against any concurrent
splits/deletions? I will reread the documentation/code again, but I must
admit, it's not particularly easy to understand all the concurrency issues.

> I also have my
> doubts about unique index enforcement remaining correct with the patch
> when there are many physical duplicates, to the extent that more than
> a single leaf page is needed for a single value.
>

Why would that happen? We are checking if the to-be-inserted key is
strictly greater than the last key in the rightmost block and only then
proceeding with the fast path. And we do this while holding an exclusive
lock on the page. Are you suggesting that someone can concurrently still
split the page?

>
> Maybe you should do something with page LSN here -- cache LSN
> alongside block number, and have a non-matching LSN invalidate the
> test.
>

This is a good idea, but we might need to track the block/LSN in shared
memory. Otherwise with multiple backends doing inserts, we might never be
able to match LSNs (i.e. they will mostly vary, thus falling back to slow
path for majority of the time)

>
> How many clients are involved with your benchmark?
>

Just a single client for these benchmarks.

>
> > So as the size of the index increases, the benefits of the patch also
> tend
> > to increase. This is most likely because as the index becomes taller and
> > taller, the costs associated with index descent becomes higher.
>
> FWIW, I think that int4 indexes have to be extremely large before they
> exceed 4 levels (where the leaf level counts as a level). My
> conservative back-of-an-envelope estimate is 9 billion index tuples.
>
>
Hmm Ok. I am not entirely sure whether it's the depth or just purely binary
searching through 3-4 index pages and/or pinning/unpinning buffers result
in much overhead. I will run some more tests and collect evidences.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Darafei Komяpa Praliaskouski 2018-01-02 12:54:24 Re: Better testing coverage and unified coding for plpgsql loops
Previous Message Alvaro Herrera 2018-01-02 12:20:05 Re: Better testing coverage and unified coding for plpgsql loops