Re: memory layouts for binary search in nbtree

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: memory layouts for binary search in nbtree
Date: 2017-06-27 17:57:15
Message-ID: CAH2-WzmW9LjTfzp7uY4CHsF3NH0YM-_pow3LXRh18ORLgeu48A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 19, 2016 at 7:28 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Abbreviated keys in indexes are supposed to help with this. Basically,
> the ItemId array is made to be interlaced with small abbreviated keys
> (say one or two bytes), only in the typically less than 1% of pages
> that are internal (leaf page binary searches don't change).

I looked at this again recently. I wrote a patch to prove to myself
that we can fairly easily reclaim 15 bits from every nbtree internal
page ItemId, and put an abbreviated key there instead. The patch has
nbtree tell PageAdditem() to store an abbreviated key within lp_len
(actually, we call PageAddItemAbbrev() now). I didn't realize that
stealing lp_len might be feasible until recently, but it appears to be
-- this is a lot simpler than other approaches might be. I already
implemented a rudimentary, POC encoding scheme for int4, but text is
the datatype that I'd expect to see a real benefit for.

While this POC patch of mine could easily have bugs, it is at least
true that "make check-world" passes for me. For nbtree, reclaiming the
lp_len space was just a matter of teaching a small number of existing
places to go to the IndexTuple to get a tuple's length, rather than
trusting an ItemId's lp_len field (that is, rather than calling
ItemIdGetLength()). Most nbtree related code already gets the length
from the index tuple header today, since it's pretty rare to care
about the length of a tuple but not its contents. This did require
updating some generic routines within bufpage.c, too, but that wasn't
so bad.

Can you suggest a workload/hardware to assess the benefits of an
optimization like this, Andres? Is there a profile available somewhere
in the archives that shows many cache misses within _bt_binsrch()?

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-06-27 18:04:54 Re: memory layouts for binary search in nbtree
Previous Message Petr Jelinek 2017-06-27 16:47:31 Re: Get stuck when dropping a subscription during synchronizing table