From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Salvatore Dipietro <dipietro(dot)salvatore(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Salvatore Dipietro <dipiets(at)amazon(dot)com>, blakgeof(at)amazon(dot)com |
Subject: | Re: Remove Instruction Synchronization Barrier in spin_delay() for ARM64 architecture |
Date: | 2025-05-01 21:50:36 |
Message-ID: | aBPsrFbjnrqp3_8S@nathan |
Lists: | pgsql-hackers |
On Thu, May 01, 2025 at 04:08:06PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
>> ... commit 3d0b4b1 recently added a non-locking
>> initial test in AArch64's TAS_SPIN, so I wonder if the ISB is still
>> appropriate. It'd be interesting to see the performance difference of
>> removing the ISB with and without commit 3d0b4b1 applied.
>
> Oh! That's an excellent point. The OP didn't mention if their tests
> were done before or after 3d0b4b1, but that might well matter.
>
> I still think pgbench is a very blunt tool for this type of testing,
> though. I recommend resurrecting the test_shm_mq-based hack discussed
> in the prior thread and seeing what that shows.
Well, I have interesting results. This is all on a c8g.24xlarge (96 cores,
Neoverse-V2, Armv9.0-a).
For the first test_shm_mq test, I ran the following:
    SELECT test_shm_mq_pipelined(16384, 'xyzzy', 10000000, 1);
    SELECT test_shm_mq_pipelined(16384, 'xyzzy', 10000000, 2);
    SELECT test_shm_mq_pipelined(16384, 'xyzzy', 10000000, 4);
    ...
This gave me the following results (values are in seconds):
              w/o 3d0b4b1        w/ 3d0b4b1
workers      ISB   no ISB      ISB   no ISB
      1      1.4      1.6      1.5      1.6
      2      2.1      2.0      2.1      2.1
      4      3.2      3.5      3.3      3.5
      8      7.4      8.1      7.2      8.4
     16     18.0     35.9     22.7     23.4
     32     35.7     85.6     53.7     49.5
     64     85.1        ?    147.6    100.1
For the second test_shm_mq test, I ran at higher concurrency, so I had to
reduce the loop counts:
    SELECT test_shm_mq_pipelined(16384, 'xyzzy', 100000, 32);
    ...
That gave me the following:
              w/o 3d0b4b1        w/ 3d0b4b1
workers      ISB   no ISB      ISB   no ISB
     32      0.4      0.8      0.5      0.6
     64      2.0      4.8      1.3      1.1
    128      6.1     29.3      7.5      2.1
    256     43.0     66.4     24.4      4.5
Finally, I ran the pgbench select-only test with
pg_stat_statements.track_planning enabled (values are in thousands of
transactions per second):
       w/o 3d0b4b1        w/ 3d0b4b1
      ISB   no ISB      ISB   no ISB
     71.4     67.4    538.2    891.2
So...
* The ISB does seem to have a positive effect without commit 3d0b4b1
  applied.
* With commit 3d0b4b1 applied, removing the ISB seems to have a positive
  effect at high concurrencies.  This is especially pronounced in the
  pgbench test.
* With commit 3d0b4b1 applied, removing the ISB doesn't change much at
  lower concurrencies, and there might even be a small regression.
* Mostly at lower concurrencies, commit 3d0b4b1 actually seems to regress
  some test_shm_mq tests.  Removing the ISB instruction appears to help in
  some cases, but not all.
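
For readers who don't have the two code paths in their head, here is a
rough, portable sketch of what is being compared: a spin loop whose retry
step does a non-locking read before the atomic exchange (the idea behind
commit 3d0b4b1's TAS_SPIN change), and a delay step that optionally issues
an ISB.  This is not the s_lock.h code.  Every demo_* name is made up for
illustration, C11 atomics stand in for the real TAS implementation, and the
backoff/sleep logic of the real spin loop is omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock word; not PostgreSQL's slock_t. */
typedef atomic_int demo_slock_t;

/*
 * Plain test-and-set: always performs the atomic exchange.
 * Returns nonzero if the lock was already held by someone else.
 */
static inline int
demo_tas(demo_slock_t *lock)
{
    return atomic_exchange_explicit(lock, 1, memory_order_acquire);
}

/*
 * TAS with a non-locking initial test: read the lock word first and only
 * attempt the exchange if the lock looks free.  Waiters then spin on a
 * plain load instead of repeatedly doing atomic writes to the cache line.
 */
static inline int
demo_tas_spin(demo_slock_t *lock)
{
    if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
        return 1;               /* looks held, skip the exchange */
    return demo_tas(lock);
}

/*
 * Delay between retries.  On AArch64 with use_isb set, issue an ISB;
 * elsewhere (or with use_isb false) this is a no-op in this sketch.
 */
static inline void
demo_spin_delay(bool use_isb)
{
#if defined(__aarch64__)
    if (use_isb)
        __asm__ __volatile__(" isb; \n");
#else
    (void) use_isb;
#endif
}

/* Acquire: one plain TAS, then retry with the read-first variant. */
static inline void
demo_lock(demo_slock_t *lock, bool use_isb)
{
    if (demo_tas(lock))
    {
        while (demo_tas_spin(lock))
            demo_spin_delay(use_isb);
    }
}

/* Release: store zero with release ordering. */
static inline void
demo_unlock(demo_slock_t *lock)
{
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

In these terms, the four columns in the tables above correspond to whether
the retry step reads first (w/ 3d0b4b1) or always does the exchange (w/o),
crossed with use_isb being true or false.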
--
nathan