Re: Prefetch the next tuple's memory during seqscans

From: vignesh C <vignesh21(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Prefetch the next tuple's memory during seqscans
Date: 2023-01-04 10:06:27
Message-ID: CALDaNm3Cy3w_kn-bkeHycVZM-Xhytqas+5xoaaUYvOuq5N+mJA@mail.gmail.com
Lists: pgsql-hackers

On Wed, 23 Nov 2022 at 03:28, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Thu, 3 Nov 2022 at 06:25, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Attached is an experimental patch/hack for that. It ended up being more
> > beneficial to make the access ordering more optimal than prefetching the tuple
> > contents, but I'm not at all sure that's the be-all-end-all.
>
> Thanks for writing that patch. I've been experimenting with it.
>
> I tried unrolling the loop (patch 0003) as you mentioned in:
>
> + * FIXME: Worth unrolling so that we don't fetch the same cacheline
> + * over and over, due to line items being smaller than a cacheline?
>
> but didn't see any gains from doing that.
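
The unrolling idea targets the fact that an ItemId is 4 bytes while a
cacheline is typically 64, so a per-item prefetch keeps touching the same
line. One way that can look is roughly the following sketch (not the
actual 0003 patch):

    /* Process line pointers in groups of 16 (one 64-byte cacheline of
     * 4-byte ItemIds), prefetching the next group's cacheline once per
     * group rather than once per item. */
    for (lineoff = FirstOffsetNumber; lineoff <= lines; lineoff += 16)
    {
        OffsetNumber    last = Min(lineoff + 15, lines);
        OffsetNumber    off;

        if (last < lines)
            pg_prefetch_mem(PageGetItemId(page, last + 1));

        for (off = lineoff; off <= last; off++)
        {
            /* ... existing visibility check and tuple handling ... */
        }
    }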
>
> I also adjusted your patch a little so that instead of doing:
>
> - OffsetNumber rs_vistuples[MaxHeapTuplesPerPage]; /* their offsets */
> + OffsetNumber *rs_vistuples;
> + OffsetNumber rs_vistuples_d[MaxHeapTuplesPerPage]; /* their offsets */
>
> to work around the issue of having to populate rs_vistuples_d in
> reverse, I added a new field called rs_startindex to mark where the
> first element in the rs_vistuples array is. The way you wrote it seems
> to require fewer code changes, but per the FIXME comment you left, I
> get the idea you wrote it that way just to get it working well enough
> for testing.
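
The rs_startindex approach amounts to keeping the fixed-size array but
filling it from the back, roughly like this sketch (not the exact 0005
patch):

    /* HeapScanDescData: keep the array, remember where it starts */
    OffsetNumber rs_vistuples[MaxHeapTuplesPerPage];   /* their offsets */
    int          rs_startindex;     /* first used slot in rs_vistuples */

    /* heapgetpage(): walk the line pointers in reverse and fill the
     * array from the end, so the offsets still end up ascending */
    int startindex = MaxHeapTuplesPerPage;

    for (lineoff = lines; lineoff >= FirstOffsetNumber; lineoff--)
    {
        /* "valid" comes from the existing visibility check */
        if (valid)
            scan->rs_vistuples[--startindex] = lineoff;
    }
    scan->rs_startindex = startindex;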
>
> I'm quite keen to move forward with committing the 0001 patch to add the
> pg_prefetch_mem macro. What I'm a little undecided about is which patch
> is best to commit first to make use of the new macro.
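
For anyone who hasn't looked at 0001: pg_prefetch_mem is essentially a
thin wrapper around the compiler's prefetch builtin. A minimal sketch,
assuming GCC/Clang's __builtin_prefetch and a HAVE__BUILTIN_PREFETCH
configure symbol (the actual patch may gate it differently):

    /* Hint the CPU to pull addr into cache without blocking; no-op
     * fallback where the builtin isn't available. */
    #ifdef HAVE__BUILTIN_PREFETCH
    #define pg_prefetch_mem(addr)  __builtin_prefetch(addr)
    #else
    #define pg_prefetch_mem(addr)  ((void) 0)
    #endif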
>
> I did some tests on the attached set of patches:
>
> alter system set max_parallel_workers_per_gather = 0;
> select pg_reload_conf();
>
> create table t as select a from generate_series(1,10000000)a;
> alter table t set (autovacuum_enabled=false);
>
> $ cat bench.sql
> select * from t where a = 0;
>
> psql -c "select pg_prewarm('t');" postgres
>
> -- Test 1 no frozen tuples in "t"
>
> Master (@9c6ad5eaa):
> $ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
> latency average = 383.332 ms
> latency average = 375.747 ms
> latency average = 376.090 ms
>
> Master + 0001 + 0002:
> $ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
> latency average = 370.133 ms
> latency average = 370.149 ms
> latency average = 370.157 ms
>
> Master + 0001 + 0005:
> $ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
> latency average = 372.662 ms
> latency average = 371.034 ms
> latency average = 372.709 ms
>
> -- Test 2 "select count(*) from t" with all tuples frozen
>
> $ cat bench1.sql
> select count(*) from t;
>
> psql -c "vacuum freeze t;" postgres
> psql -c "select pg_prewarm('t');" postgres
>
> Master (@9c6ad5eaa):
> $ pgbench -n -f bench1.sql -M prepared -T 10 postgres | grep -E "^latency"
> latency average = 406.238 ms
> latency average = 407.029 ms
> latency average = 406.962 ms
>
> Master + 0001 + 0005:
> $ pgbench -n -f bench1.sql -M prepared -T 10 postgres | grep -E "^latency"
> latency average = 345.470 ms
> latency average = 345.775 ms
> latency average = 345.354 ms
>
> My current thoughts are that it might be best to go with 0005 to start
> with. I know Melanie is working on making some changes in this area,
> so perhaps it's best to leave 0002 until that work is complete.

The patch does not apply on top of HEAD, as noted in [1]; please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
5212d447fa53518458cbe609092b347803a667c5 ===
=== applying patch ./v2-0001-Add-pg_prefetch_mem-macro-to-load-cache-lines.patch
=== applying patch ./v2-0002-Perform-memory-prefetching-in-heapgetpage.patch
patching file src/backend/access/heap/heapam.c
Hunk #1 FAILED at 451.
1 out of 6 hunks FAILED -- saving rejects to file
src/backend/access/heap/heapam.c.rej

[1] - http://cfbot.cputube.org/patch_41_3978.log

Regards,
Vignesh
