Re: Remove Instruction Synchronization Barrier in spin_delay() for ARM64 architecture

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Salvatore Dipietro <dipietro(dot)salvatore(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Salvatore Dipietro <dipiets(at)amazon(dot)com>, blakgeof(at)amazon(dot)com
Subject: Re: Remove Instruction Synchronization Barrier in spin_delay() for ARM64 architecture
Date: 2025-05-01 21:50:36
Message-ID: aBPsrFbjnrqp3_8S@nathan
Lists: pgsql-hackers

On Thu, May 01, 2025 at 04:08:06PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
>> ... commit 3d0b4b1 recently added a non-locking
>> initial test in AArch64's TAS_SPIN, so I wonder if the ISB is still
>> appropriate. It'd be interesting to see the performance difference of
>> removing the ISB with and without commit 3d0b4b1 applied.
>
> Oh! That's an excellent point. The OP didn't mention if their tests
> were done before or after 3d0b4b1, but that might well matter.
>
> I still think pgbench is a very blunt tool for this type of testing,
> though. I recommend resurrecting the test_shm_mq-based hack discussed
> in the prior thread and seeing what that shows.

Well, I have interesting results. This is all on a c8g.24xlarge (96 cores,
Neoverse-V2, Armv9.0-a).

For the first test_shm_mq test, I ran the following:

SELECT test_shm_mq_pipelined(16384, 'xyzzy', 10000000, 1);
SELECT test_shm_mq_pipelined(16384, 'xyzzy', 10000000, 2);
SELECT test_shm_mq_pipelined(16384, 'xyzzy', 10000000, 4);
...

This gave me the following results (values are in seconds):

          w/o 3d0b4b1       w/ 3d0b4b1
workers     ISB   no ISB      ISB   no ISB
      1     1.4      1.6      1.5      1.6
      2     2.1      2.0      2.1      2.1
      4     3.2      3.5      3.3      3.5
      8     7.4      8.1      7.2      8.4
     16    18.0     35.9     22.7     23.4
     32    35.7     85.6     53.7     49.5
     64    85.1        ?    147.6    100.1

For the second test_shm_mq test, I ran at higher concurrency, so I had to
reduce the loop counts:

SELECT test_shm_mq_pipelined(16384, 'xyzzy', 100000, 32);
...

That gave me the following:

          w/o 3d0b4b1       w/ 3d0b4b1
workers     ISB   no ISB      ISB   no ISB
     32     0.4      0.8      0.5      0.6
     64     2.0      4.8      1.3      1.1
    128     6.1     29.3      7.5      2.1
    256    43.0     66.4     24.4      4.5

Finally, I ran the pgbench select-only test with
pg_stat_statements.track_planning enabled (values are in thousands of
transactions per second):

  w/o 3d0b4b1      w/ 3d0b4b1
   ISB   no ISB      ISB   no ISB
  71.4     67.4    538.2    891.2
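
To put a few of the numbers above on a common scale, here is a small sketch that turns them into ratios. The helper names are made up for illustration; the inputs are copied from the 256-worker row of the second test_shm_mq table and from the pgbench row. A ratio above 1.0 means removing the ISB was faster.

```c
#include <stdio.h>

/* Timings (seconds, lower is better): above 1.0 means "after" is faster. */
static double time_speedup(double before_s, double after_s)
{
    return before_s / after_s;
}

/* Throughput (TPS, higher is better): above 1.0 means "after" is faster. */
static double throughput_gain(double before_tps, double after_tps)
{
    return after_tps / before_tps;
}

/* Effect of removing the ISB, using values copied from the tables. */
static void print_isb_removal_summary(void)
{
    printf("256 workers, w/o 3d0b4b1: %.2fx\n", time_speedup(43.0, 66.4));
    printf("256 workers, w/  3d0b4b1: %.2fx\n", time_speedup(24.4, 4.5));
    printf("pgbench,     w/  3d0b4b1: %.2fx\n", throughput_gain(538.2, 891.2));
}
```

These are single measurements per cell, so the ratios only indicate direction and rough magnitude.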

So...

* The ISB does seem to have a positive effect without commit 3d0b4b1
applied.

* With commit 3d0b4b1 applied, removing the ISB seems to have a positive
effect at high concurrencies. This is especially pronounced in the
pgbench test.

* With commit 3d0b4b1 applied, removing the ISB doesn't change much at
lower concurrencies, and there might even be a small regression.

* Mostly at lower concurrencies, commit 3d0b4b1 actually seems to regress
some test_shm_mq tests. Removing the ISB instruction appears to help in
some cases, but not all.
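
For readers who don't have the two knobs in their head, here is a rough sketch of what is being compared: a spin loop whose retry path does a non-locking read before attempting the atomic test-and-set, and a delay step that may or may not issue an ISB. This is not the PostgreSQL code. The demo_* names and the USE_ISB macro are invented for illustration, it uses C11 atomics instead of the real TAS()/TAS_SPIN() macros, and it assumes GCC or Clang for the inline asm.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type; PostgreSQL's slock_t and TAS macros differ. */
typedef struct { atomic_int locked; } demo_spinlock;

/* One delay step between attempts. With USE_ISB defined on AArch64 this
 * issues an ISB; otherwise it is only a compiler barrier. */
static inline void demo_spin_delay(void)
{
#if defined(__aarch64__) && defined(USE_ISB)
    __asm__ __volatile__("isb" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Plain test-and-set: returns true if the lock was acquired. */
static inline bool demo_tas(demo_spinlock *l)
{
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

/* Retry path: a non-locking read first, so waiters do not keep issuing
 * atomic writes to the lock's cache line while it is held. */
static inline bool demo_tas_spin(demo_spinlock *l)
{
    if (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
        return false;
    return demo_tas(l);
}

/* Returns the number of delay steps taken, or -1 after max_spins tries. */
static int demo_lock(demo_spinlock *l, int max_spins)
{
    int spins = 0;

    if (demo_tas(l))
        return 0;
    while (!demo_tas_spin(l))
    {
        if (++spins >= max_spins)
            return -1;
        demo_spin_delay();
    }
    return spins;
}

static void demo_unlock(demo_spinlock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

In this sketch, "w/ 3d0b4b1" corresponds roughly to having the non-locking read in demo_tas_spin(), and "ISB" vs. "no ISB" corresponds to building with or without USE_ISB. The real loop also backs off and sleeps, which is omitted here.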

--
nathan
