Re: Experimenting with hash join prefetch

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Experimenting with hash join prefetch
Date: 2020-02-03 12:48:49
Message-ID: CA+hUKG+pi63ZbcZkYK3XB1pfN=kuaDaeV0Ha9E+X_p6TTbKBYw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 12, 2019 at 4:23 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> ... if we also prefetch during
> the build phase ...

Here's an experimental patch to investigate just that part. I tried
initiating a prefetch of the bucket just before we copy the tuple and
then finally insert it, but the two steps don't seem to be far enough
apart (at least for small tuples) for the prefetch to complete in time,
which is a shame because that'd be a one-line patch. So I made a little
insert queue that prefetches the bucket and defers the insertion until
N tuples later, and with that I managed to get between 10% and 20%
speed-up for contrived tests like this:

create unlogged table t as select generate_series(1, 100000000)::int i;
select pg_prewarm('t');
set work_mem='8GB';

select count(*) from t t1 join t t2 using (i);

            master    patched/N=1      patched/N=4
workers=0   89.808s   80.556s (+11%)   76.864s (+16%)
workers=2   27.010s   22.679s (+19%)   23.503s (+15%)
workers=4   16.728s   14.896s (+12%)   14.167s (+18%)

Just an early experiment, but I thought it looked promising enough to share.
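
To illustrate the insert-queue idea described above, here is a minimal
standalone sketch in C. It is not the attached patch and does not use
PostgreSQL's hash join data structures: Tuple, HashTable, QUEUE_DEPTH,
ht_queue_insert() and ht_queue_flush() are all hypothetical names made up
for this example, and the prefetch relies on the GCC/Clang builtin
__builtin_prefetch (compiled out elsewhere). Each new tuple triggers a
prefetch of its bucket and is parked in a small ring buffer; it is only
linked into the bucket chain once QUEUE_DEPTH further tuples have arrived.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define QUEUE_DEPTH 4			/* hypothetical N: insertion is deferred by this many tuples */

typedef struct Tuple
{
	struct Tuple *next;			/* next tuple in the same bucket chain */
	uint32_t	hash;
	int			key;
} Tuple;

typedef struct HashTable
{
	Tuple	  **buckets;
	size_t		nbuckets;		/* must be a power of two */
	Tuple	   *pending[QUEUE_DEPTH];	/* ring buffer of deferred insertions */
	size_t		nqueued;		/* tuples queued since the last flush */
} HashTable;

static void
ht_init(HashTable *ht, size_t nbuckets)
{
	assert(nbuckets > 0 && (nbuckets & (nbuckets - 1)) == 0);
	ht->buckets = calloc(nbuckets, sizeof(Tuple *));
	assert(ht->buckets != NULL);
	ht->nbuckets = nbuckets;
	ht->nqueued = 0;
}

/* Push a tuple onto the head of its bucket chain. */
static void
ht_link(HashTable *ht, Tuple *tuple)
{
	Tuple	  **bucket = &ht->buckets[tuple->hash & (ht->nbuckets - 1)];

	tuple->next = *bucket;
	*bucket = tuple;
}

/* Prefetch the tuple's bucket now, but link the tuple QUEUE_DEPTH calls later. */
static void
ht_queue_insert(HashTable *ht, Tuple *tuple)
{
	size_t		slot = ht->nqueued % QUEUE_DEPTH;

	/* Once the ring is full, this slot holds the oldest deferred tuple. */
	if (ht->nqueued >= QUEUE_DEPTH)
		ht_link(ht, ht->pending[slot]);

#if defined(__GNUC__)
	/* Hint that the bucket header is about to be written (GCC/Clang only). */
	__builtin_prefetch(&ht->buckets[tuple->hash & (ht->nbuckets - 1)], 1, 3);
#endif

	ht->pending[slot] = tuple;
	ht->nqueued++;
}

/* Link whatever is still deferred; call once after the last tuple. */
static void
ht_queue_flush(HashTable *ht)
{
	size_t		n = ht->nqueued < QUEUE_DEPTH ? ht->nqueued : QUEUE_DEPTH;

	for (size_t i = ht->nqueued - n; i < ht->nqueued; i++)
		ht_link(ht, ht->pending[i % QUEUE_DEPTH]);
	ht->nqueued = 0;
}

/* Count the tuples that have actually been linked into the table. */
static size_t
ht_count(const HashTable *ht)
{
	size_t		count = 0;

	for (size_t b = 0; b < ht->nbuckets; b++)
		for (const Tuple *t = ht->buckets[b]; t != NULL; t = t->next)
			count++;
	return count;
}
```

A caller would run ht_queue_insert() for every tuple of the build side
and then ht_queue_flush() once before probing, since up to QUEUE_DEPTH
tuples are not yet visible in the table until the flush. Whether any
particular depth helps depends on tuple size and hardware, as the numbers
above suggest.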

Attachment Content-Type Size
0001-Prefetch-cache-lines-while-building-hash-join-table.patch application/octet-stream 14.7 KB
