Re: Prefetch the next tuple's memory during seqscans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Prefetch the next tuple's memory during seqscans
Date: 2022-10-31 22:17:14
Message-ID: CAApHDvqWexy_6jGmB39Vr3OqxZ_w6stAFkq52hODvwaW-19aiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 1 Nov 2022 at 03:12, Aleksander Alekseev
<aleksander(at)timescale(dot)com> wrote:
> I wonder if we can be sure and/or check that there is no performance
> degradation under different loads and different platforms...

Different platforms would be good. Certainly, 1 platform isn't a good
enough indication that this is going to be useful.

As for different loads. I imagine the worst case for this will be that
the prefetched tuple is flushed from the cache by some other operation
in the plan making the prefetch useless.

I tried the following so that we read 1 million tuples from a Sort
node before coming back and reading another tuple from the seqscan.

create table a as select 1 as a from generate_series(1,2) a;
create table b as select 1 as a from generate_series(1,10000000) a;
vacuum freeze a,b;

select pg_prewarm('a'),pg_prewarm('b');
set work_mem = '256MB';
select * from a, lateral (select * from b order by a) b offset 20000000;

Master (@ a9f8ca600)
Time: 1414.590 ms (00:01.415)
Time: 1373.584 ms (00:01.374)
Time: 1373.057 ms (00:01.373)
Time: 1383.033 ms (00:01.383)
Time: 1378.865 ms (00:01.379)

Master + 0001 + 0003:
Time: 1352.726 ms (00:01.353)
Time: 1348.306 ms (00:01.348)
Time: 1358.033 ms (00:01.358)
Time: 1354.348 ms (00:01.354)
Time: 1353.971 ms (00:01.354)

As I'd have expected, I see no regression. It's hard to imagine we'd
be able to measure the regression over the overhead of some operation
that would evict everything out of cache.

FWIW, this CPU has a 256MB L3 cache and the Sort node's EXPLAIN
ANALYZE looks like:

Sort Method: quicksort Memory: 262144kB

> Also I see 0001 and 0003 but no 0002. Just wanted to double check that
> there is no patch missing.

Perhaps I should resequence the patches to avoid confusion. I didn't
send 0002 on purpose. The 0002 is Andres' patch to prefetch during HOT
pruning. Here I'm only interested in seeing if we can get the
pg_prefetch_mem macros in core to reduce the number of AIO patches by
1.

Another thing about this is that I'm really only fetching the first
cache line of the tuple. All columns in the t2 table (from the
earlier email) are fixed width, so accessing the a16 column is a
cached offset. I ran a benchmark using the same t2 table as my earlier
email, i.e:

-- table with 64 bytes of user columns
create table t2 as
select a,a a2,a a3,a a4,a a5,a a6,a a7,a a8,a a9,a a10,a a11,a a12,a
a13,a a14,a a15,a a16
from generate_series(1,10000000)a;
vacuum freeze t2;

My test is to run 16 queries changing the WHERE clause each time to
have WHERE a = 0, then WHERE a2 = 0 ... WHERE a16 = 0. I wanted to
know if prefetching only the first cache line of the tuple would be
less useful when we require evaluation of say, the "a16" column vs the
"a" column.

The times below (in milliseconds) are what I got from a 10-second pgbench run:

column master patched
a 490.571 409.748
a2 428.004 430.927
a3 449.156 453.858
a4 474.945 479.73
a5 514.646 507.809
a6 517.525 519.956
a7 543.587 539.023
a8 562.718 559.387
a9 585.458 584.63
a10 609.143 604.606
a11 645.273 638.535
a12 658.848 657.377
a13 696.395 685.389
a14 702.779 716.722
a15 727.161 723.567
a16 756.186 749.396

I'm not sure how to explain why only the "a" column seems to improve
and the rest seem mostly unaffected.

David

Attachment Content-Type Size
bench_prefetch.sh.txt text/plain 269 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2022-10-31 22:37:44 Re: heapgettup refactoring
Previous Message Dagfinn Ilmari Mannsåker 2022-10-31 22:05:21 Re: User functions for building SCRAM secrets