Re: Prefetch the next tuple's memory during seqscans

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Prefetch the next tuple's memory during seqscans
Date: 2022-11-02 17:25:44
Message-ID: 20221102172544.hoszrut7tfepc3dc@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-11-01 20:00:43 -0700, Andres Freund wrote:
> I suspect that prefetching in heapgetpage() would provide gains as well, at
> least for pages that aren't marked all-visible, pretty common in the real
> world IME.

Attached is an experimental patch/hack for that. Improving the access ordering
ended up being more beneficial than prefetching the tuple contents, but I'm
not at all sure that's the be-all and end-all.

I benchmarked three configurations separately: CPU and memory pinned to the
same socket, pinned to different sockets, and memory interleaved across
sockets.

I did this for HEAD, your patch, your patch plus mine, and mine alone.

BEGIN;
DROP TABLE IF EXISTS large;
CREATE TABLE large(a int8 not null, b int8 not null default '0', c int8);
INSERT INTO large SELECT generate_series(1, 50000000);
COMMIT;

The server is started with:
local: numactl --membind 1 --physcpubind 10
remote: numactl --membind 0 --physcpubind 10
interleave: numactl --interleave=all --physcpubind 10

The benchmark is started with:
psql -qX -f ~/tmp/prewarm.sql && \
pgbench -n -f ~/tmp/seqbench.sql -t 1 -r > /dev/null && \
perf stat -e task-clock,LLC-loads,LLC-load-misses,cycles,instructions -C 10 \
pgbench -n -f ~/tmp/seqbench.sql -t 3 -r

seqbench.sql:
SELECT count(*) FROM large WHERE c IS NOT NULL;
SELECT sum(a), sum(b), sum(c) FROM large;
SELECT sum(c) FROM large;

branch         memory       time s   miss %
head           local        31.612   74.03
david          local        32.034   73.54
david+andres   local        31.644   42.80
andres         local        30.863   48.05

head           remote       33.350   72.12
david          remote       33.425   71.30
david+andres   remote       32.428   49.57
andres         remote       30.907   44.33

head           interleave   32.465   71.33
david          interleave   33.176   72.60
david+andres   interleave   32.590   46.23
andres         interleave   30.440   45.13

It's cool seeing how optimizing heapgetpage() seems to pretty much remove
the performance difference between local and remote memory.

It makes some sense that David's patch doesn't help in this case - without
all-visible being set the tuple headers will have already been pulled in for
the HTSV call.

I've not yet experimented with moving the prefetch for the tuple contents from
David's location to before the HTSV. I suspect that might benefit both
workloads.
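
To make the idea of prefetching ahead of the visibility check concrete, here
is a rough sketch. It is not the attached patch and does not use PostgreSQL's
real page layout: DemoTuple, demo_tuple_visible() and demo_collect_visible()
are hypothetical names made up for illustration, and __builtin_prefetch is
the GCC/Clang builtin, assumed to be available. The loop issues a prefetch
for the next tuple before running the (stand-in) visibility check on the
current one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tuple; not PostgreSQL's HeapTupleHeaderData. */
typedef struct DemoTuple
{
	uint32_t	xmin_committed; /* stand-in for header visibility info */
	int64_t		payload;		/* stand-in for the tuple contents */
} DemoTuple;

/* Stand-in for a visibility check: reads only the tuple header. */
static int
demo_tuple_visible(const DemoTuple *tup)
{
	return tup->xmin_committed != 0;
}

/*
 * Visit tuples[order[0]], tuples[order[1]], ... and store the indexes of
 * the visible ones in visible[]. Returns the number of visible tuples.
 *
 * Before checking tuple i, request the cache line of tuple i + 1 so that
 * the memory fetch can overlap with the current visibility check.
 */
static size_t
demo_collect_visible(const DemoTuple *tuples, const uint16_t *order,
					 size_t n, uint16_t *visible)
{
	size_t		nvisible = 0;

	assert(tuples != NULL && order != NULL && visible != NULL);

	for (size_t i = 0; i < n; i++)
	{
		/* rw = 0: prefetch for read; locality = 3: keep in all cache levels */
		if (i + 1 < n)
			__builtin_prefetch(&tuples[order[i + 1]], 0, 3);

		if (demo_tuple_visible(&tuples[order[i]]))
			visible[nvisible++] = order[i];
	}

	return nvisible;
}
```

The prefetch distance of one tuple is an arbitrary choice for the sketch; the
right distance, and whether the prefetch pays off at all, would have to be
measured the same way as the numbers above.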

Greetings,

Andres Freund

Attachment Content-Type Size
prefetch-heapgetpage.diff text/x-diff 5.0 KB
