Re: Prefetch the next tuple's memory during seqscans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Prefetch the next tuple's memory during seqscans
Date: 2022-11-22 21:58:07
Message-ID: CAApHDvo1TL1Bm6AWnPwGYVya82TA3fuqRQsjvTjA+=NzHa6cuw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 3 Nov 2022 at 06:25, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Attached is an experimental patch/hack for that. It ended up being more
> beneficial to make the access ordering more optimal than prefetching the tuple
> contents, but I'm not at all sure that's the be-all-end-all.

Thanks for writing that patch. I've been experimenting with it.

I tried unrolling the loop (patch 0003) as you mentioned in:

+ * FIXME: Worth unrolling so that we don't fetch the same cacheline
+ * over and over, due to line items being smaller than a cacheline?

but didn't see any gains from doing that.

I also adjusted your patch a little so that instead of doing:

- OffsetNumber rs_vistuples[MaxHeapTuplesPerPage]; /* their offsets */
+ OffsetNumber *rs_vistuples;
+ OffsetNumber rs_vistuples_d[MaxHeapTuplesPerPage]; /* their offsets */

to work around the issue of having to populate rs_vistuples_d in
reverse, I added a new field called rs_startindex to mark where the
first element in the rs_vistuples array is. The way you wrote it seems
to require fewer code changes, but per the FIXME comment you left, I
get the idea you just did it the way you did to make it work enough
for testing.

I'm quite keen to move forward with committing the 0001 patch to add
the pg_prefetch_mem macro. What I'm a little undecided about is which
patch is best to commit first to make use of the new macro.

I did some tests on the attached set of patches:

alter system set max_parallel_workers_per_gather = 0;
select pg_reload_conf();

create table t as select a from generate_series(1,10000000)a;
alter table t set (autovacuum_enabled=false);

$ cat bench.sql
select * from t where a = 0;

psql -c "select pg_prewarm('t');" postgres

-- Test 1: no frozen tuples in "t"

Master (@9c6ad5eaa):
$ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 383.332 ms
latency average = 375.747 ms
latency average = 376.090 ms

Master + 0001 + 0002:
$ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 370.133 ms
latency average = 370.149 ms
latency average = 370.157 ms

Master + 0001 + 0005:
$ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 372.662 ms
latency average = 371.034 ms
latency average = 372.709 ms

-- Test 2: "select count(*) from t" with all tuples frozen

$ cat bench1.sql
select count(*) from t;

psql -c "vacuum freeze t;" postgres
psql -c "select pg_prewarm('t');" postgres

Master (@9c6ad5eaa):
$ pgbench -n -f bench1.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 406.238 ms
latency average = 407.029 ms
latency average = 406.962 ms

Master + 0001 + 0005:
$ pgbench -n -f bench1.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 345.470 ms
latency average = 345.775 ms
latency average = 345.354 ms

My current thoughts are that it might be best to go with 0005 to start
with. I know Melanie is working on making some changes in this area,
so perhaps it's best to leave 0002 until that work is complete.

David

Attachment Content-Type Size
v2-0001-Add-pg_prefetch_mem-macro-to-load-cache-lines.patch text/plain 5.3 KB
v2-0002-Perform-memory-prefetching-in-heapgetpage.patch text/plain 6.3 KB
v2-0005-Prefetch-tuple-memory-during-forward-seqscans.patch text/plain 1.0 KB
v2-0004-heapam-WIP-cacheline-prefetching-for-hot-pruning.patch text/plain 1.4 KB
v2-0003-Unroll-loop-in-heapgetpage.patch text/plain 2.7 KB
