Re: Prefetch the next tuple's memory during seqscans

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Prefetch the next tuple's memory during seqscans
Date: 2022-12-02 01:47:55
Message-ID: CAApHDvraHHcaMp27MRW9j=+JwJ9_K8QujyhzxFcrWQpEv6DF9Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, 1 Dec 2022 at 18:18, John Naylor <john(dot)naylor(at)enterprisedb(dot)com> wrote:
> I then tested a Power8 machine (also kernel 3.10 gcc 4.8). Configure reports "checking for __builtin_prefetch... yes", but I don't think it does anything here, as the results are within noise level. A quick search didn't turn up anything informative on this platform, and I'm not motivated to dig deeper. In any case, it doesn't make things worse.

Thanks for testing on the Power8 hardware.

Andres just let me test on some Apple M1 hardware (those cores are
insanely fast!)

Using the table and running the script from [1], with trimmed-down
output, I see:

Master @ edf12e7bbd

Testing a -> 158.037 ms
Testing a2 -> 164.442 ms
Testing a3 -> 171.523 ms
Testing a4 -> 189.892 ms
Testing a5 -> 217.197 ms
Testing a6 -> 186.790 ms
Testing a7 -> 189.491 ms
Testing a8 -> 195.384 ms
Testing a9 -> 200.547 ms
Testing a10 -> 206.149 ms
Testing a11 -> 211.708 ms
Testing a12 -> 217.976 ms
Testing a13 -> 224.565 ms
Testing a14 -> 230.642 ms
Testing a15 -> 237.372 ms
Testing a16 -> 244.110 ms

(checking for __builtin_prefetch... yes)

Master + v2-0001 + v2-0005

Testing a -> 157.477 ms
Testing a2 -> 163.720 ms
Testing a3 -> 171.159 ms
Testing a4 -> 186.837 ms
Testing a5 -> 205.220 ms
Testing a6 -> 184.585 ms
Testing a7 -> 189.879 ms
Testing a8 -> 195.650 ms
Testing a9 -> 201.220 ms
Testing a10 -> 207.162 ms
Testing a11 -> 213.255 ms
Testing a12 -> 219.313 ms
Testing a13 -> 225.763 ms
Testing a14 -> 237.337 ms
Testing a15 -> 239.440 ms
Testing a16 -> 245.740 ms

There doesn't seem to be any real improvement on this architecture.
The patched build is very slightly faster for "a" through "a6" and
very slightly slower for "a7" through "a16". That's likely within the
expected noise level.
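
For anyone who wants to check the per-column deltas, a quick sketch along
these lines reproduces the comparison from the numbers listed above. It is
not part of the patch or of the benchmark script in [1]; the variable names
are only illustrative, and the timings are copied by hand from the two
result sets.

```python
# Timings in ms, copied from the two result sets above (columns "a" .. "a16").
master = [
    158.037, 164.442, 171.523, 189.892, 217.197, 186.790, 189.491, 195.384,
    200.547, 206.149, 211.708, 217.976, 224.565, 230.642, 237.372, 244.110,
]
patched = [
    157.477, 163.720, 171.159, 186.837, 205.220, 184.585, 189.879, 195.650,
    201.220, 207.162, 213.255, 219.313, 225.763, 237.337, 239.440, 245.740,
]

# Column labels as they appear in the output: "a", "a2", ..., "a16".
labels = ["a"] + [f"a{i}" for i in range(2, 17)]

# A positive delta means the patched build was slower than master.
deltas = [p - m for m, p in zip(master, patched)]

for label, m, d in zip(labels, master, deltas):
    print(f"{label:>4}: {d:+8.3f} ms ({100.0 * d / m:+.2f}%)")
```

Negative deltas mean the patched build was faster. These are single runs
per column, so the deltas say nothing about run-to-run variance.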

David

[1] https://postgr.es/m/CAApHDvqWexy_6jGmB39Vr3OqxZ_w6stAFkq52hODvwaW-19aiA@mail.gmail.com
