Prefetch the next tuple's memory during seqscans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Prefetch the next tuple's memory during seqscans
Date: 2022-10-31 03:52:52
Message-ID: CAApHDvpTRx7hqFZGiZJ=d9JN4h1tzJ2=xt7bM-9XRmpVj63psQ@mail.gmail.com
Lists: pgsql-hackers

As part of the AIO work [1], Andres mentioned to me that he found that
prefetching tuple memory during HOT pruning showed significant wins.
I'm not proposing anything to improve HOT pruning here, but as a segue
to get the prefetching infrastructure in so that there are fewer AIO
patches, I'm proposing we prefetch the next tuple during sequential
scans while in page mode.
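
Concretely, the idea is roughly as follows (just a sketch of the
approach, with names based on the current heapgettup_pagemode() code;
the attached 0003 patch is the actual version):

    /*
     * While returning the current tuple, prefetch the memory of the
     * next visible tuple on the page so that it's more likely to be in
     * cache by the time the scan is called again (forward scans only).
     */
    if (linesleft > 1)
    {
        OffsetNumber nextoff = scan->rs_vistuples[lineindex + 1];

        pg_prefetch_mem(PageGetItem(page, PageGetItemId(page, nextoff)));
    }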

It turns out the gains are pretty good when we apply this:

-- table with 4 bytes of user columns
create table t as select a from generate_series(1,10000000)a;
vacuum freeze t;
select pg_prewarm('t');

Master @ a9f8ca600
# select * from t where a = 0;
Time: 355.001 ms
Time: 354.573 ms
Time: 354.490 ms
Time: 354.556 ms
Time: 354.335 ms

Master + 0001 + 0003:
# select * from t where a = 0;
Time: 328.578 ms
Time: 329.387 ms
Time: 329.349 ms
Time: 329.704 ms
Time: 328.225 ms (avg ~7.7% faster)

-- table with 64 bytes of user columns
create table t2 as
select a,a a2,a a3,a a4,a a5,a a6,a a7,a a8,a a9,a a10,a a11,a a12,a
a13,a a14,a a15,a a16
from generate_series(1,10000000)a;
vacuum freeze t2;
select pg_prewarm('t2');

Master:
# select * from t2 where a = 0;
Time: 501.725 ms
Time: 501.815 ms
Time: 503.225 ms
Time: 501.242 ms
Time: 502.394 ms

Master + 0001 + 0003:
# select * from t2 where a = 0;
Time: 412.076 ms
Time: 410.669 ms
Time: 410.490 ms
Time: 409.782 ms
Time: 410.843 ms (avg ~22% faster)

This was tested on an AMD 3990X CPU. I imagine the CPU matters quite a
bit here. It would be interesting to see if the same or similar gains
can be seen on a modern Intel chip too.

I believe Thomas wrote the 0001 patch (same as patch in [2]?). I only
quickly put together the 0003 patch.

I wondered if we might want to add a macro to 0001 that indicates
whether pg_prefetch_mem() expands to anything, and then use that to
#ifdef out the code I added to heapam.c. Although, perhaps most
compilers will be able to optimise away the extra instructions that
compute the address of the next tuple.
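
For illustration, what I have in mind is something like this
(hypothetical names; the attached 0001 patch may define things
differently), using __builtin_prefetch() where the compiler provides it:

    /* hypothetical sketch of pg_prefetch_mem() and a "have it" flag */
    #if defined(__GNUC__) || defined(__clang__)
    #define pg_prefetch_mem(addr)   __builtin_prefetch(addr)
    #define PG_HAVE_PREFETCH_MEM    1
    #else
    #define pg_prefetch_mem(addr)   ((void) 0)
    #endif

With something like PG_HAVE_PREFETCH_MEM defined only when the macro is
non-empty, the heapam.c code could be compiled out entirely where
prefetching isn't available.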

My tests above are likely the best case for this. It seems plausible
to me that with a much more complex plan that found a reasonable number
of tuples and did something with them, we wouldn't see the same sort of
gains. It also does not seem impossible that the prefetch just results
in evicting a cache line that's useful to some other exec node, or that
the prefetched tuple gets flushed out of the cache by the time we get
around to fetching the next tuple from the scan again, due to the other
node processing that's occurred since the seq scan was last called. I
imagine such things would be indistinguishable from noise, but I've not
tested.

I also tried prefetching out by 2 tuples. It didn't help any further
than prefetching 1 tuple.

I'll add this to the November CF.

David

[1] https://www.postgresql.org/message-id/flat/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
[2] https://www.postgresql.org/message-id/CA%2BhUKG%2Bpi63ZbcZkYK3XB1pfN%3DkuaDaeV0Ha9E%2BX_p6TTbKBYw%40mail.gmail.com

Attachment Content-Type Size
0001-Add-pg_prefetch_mem-macro-to-load-cache-lines.patch text/plain 5.2 KB
0003-Prefetch-tuple-memory-during-forward-seqscans.patch text/plain 1.0 KB
