Re: B-tree cache prefetches

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: B-tree cache prefetches
Date: 2018-08-30 21:40:59
Message-ID: CAEepm=0EGqAyoEgoaW8j5NruhpFvTAyibE60AH3-wtMEdFv18Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 31, 2018 at 5:53 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> Hi hackers!
>
> I've been at the database conference and here everyone is talking about cache prefetches.
>
> I've tried simple hack
>
> diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
> index d3700bd082..ffddf553aa 100644
> --- a/src/backend/access/nbtree/nbtsearch.c
> +++ b/src/backend/access/nbtree/nbtsearch.c
> @@ -401,6 +401,13 @@ _bt_binsrch(Relation rel,
>
> /* We have low <= mid < high, so mid points at a real slot */
>
> + OffsetNumber x = mid + 1 + ((high - mid + 1) / 2);
> + if (x < high)
> + __builtin_prefetch (PageGetItem(page, PageGetItemId(page, x)), 0, 2);
> + x = low + ((mid - low) / 2);
> + if (x > low)
> + __builtin_prefetch (PageGetItem(page, PageGetItemId(page, x)), 0, 2);
> +
> result = _bt_compare(rel, keysz, scankey, page, mid);
>
> if (result >= cmpval)
>
> The idea is pretty simple - our search are cache erasing anyway, let's try to get at least some of it by prefetching possible ways of binary search.
> And it seems to me that on a simple query
> > insert into x select (random()*1000000)::int from generate_series(1,1e7);
> it brings something like 2-4% of performance improvement on my laptop.
>
> Is there a reason why we do not use __builtin_prefetch? Have anyone tried to use cache prefetching?

A related topic is the cache-unfriendliness of traditional binary
searches of sorted data. Check out "Array Layouts for
Comparison-Based Searching"[1] if you haven't already. It says that
if your array fits in L2 cache, as our btree pages do, then binary
search still wins against Eytzinger et al, which I was a bit
disappointed about when I was (casually) investigating new layouts
based on some comments from a fellow hacker (I can't remember if it
was Andres or Peter G who mentioned this topic to me). However, the
paper is talking about "branch-free binary search with explicit
prefetch", which apparently eats plain old branchy binary search for
breakfast, and I gather they tested on a lot of hardware. That sounds
interesting to look into.

[1] https://arxiv.org/pdf/1509.05053.pdf

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-08-30 21:46:06 Re: pg_verify_checksums and -fno-strict-aliasing
Previous Message Tom Lane 2018-08-30 21:38:36 Re: Startup cost of sequential scan