Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers

From: "Greg Burd" <greg(at)burd(dot)me>
To: "Andres Freund" <andres(at)anarazel(dot)de>, "Nathan Bossart" <nathandbossart(at)gmail(dot)com>
Cc: "Thomas Munro" <thomas(dot)munro(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Dave Cramer" <davecramer(at)gmail(dot)com>
Subject: Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers
Date: 2025-12-16 12:40:13
Message-ID: b523fa2d-7683-40bc-bd94-5d4fd4264d46@app.fastmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Mon, Dec 15, 2025, at 5:32 PM, Andres Freund wrote:
> Hi,
>
> On 2025-12-15 15:38:49 -0600, Nathan Bossart wrote:
>> +++ b/meson.build
>> @@ -2523,7 +2523,8 @@ int main(void)
>> }
>> '''
>>
>> - if cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
>> + if (host_cpu == 'aarch64' and cc.get_id() == 'msvc') or \
>> + cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
>> args: test_c_args)
>> # Use ARM CRC Extension unconditionally
>> cdata.set('USE_ARMV8_CRC32C', 1)
>
> I still think this should have a comment explaining that we can
> unconditionally rely on crc32 support due to window's baseline requirements.
>
>
>> diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
>> index 7f8f566bd40..e62141abf0a 100644
>> --- a/src/include/storage/s_lock.h
>> +++ b/src/include/storage/s_lock.h
>> @@ -602,13 +602,21 @@ typedef LONG slock_t;
>>
>> #define SPIN_DELAY() spin_delay()
>>
>> -/* If using Visual C++ on Win64, inline assembly is unavailable.
>> - * Use a _mm_pause intrinsic instead of rep nop.
>> - */
>> -#if defined(_WIN64)
>> +#ifdef _M_ARM64
>> +static __forceinline void
>> +spin_delay(void)
>> +{
>> + /* Research indicates ISB is better than __yield() on AArch64. */
>> + __isb(_ARM64_BARRIER_SY);
>
> It'd be good to link the research in some form or another. Otherwise it's
> harder to evolve the code in the future, because we don't know if the research
> was "I liked the color better" or "one is catastrophically slower than the
> other".
>
>> +}
>> +#elif defined(_WIN64)
>> static __forceinline void
>> spin_delay(void)
>> {
>> + /*
>> + * If using Visual C++ on Win64, inline assembly is unavailable.
>> + * Use a _mm_pause intrinsic instead of rep nop.
>> + */
>> _mm_pause();
>> }
>> #else
>> @@ -621,12 +629,19 @@ spin_delay(void)
>> #endif
>>
>> #include <intrin.h>
>> +#ifdef _M_ARM64
>> +#pragma intrinsic(_InterlockedExchange)
>> +
>> +/* _ReadWriteBarrier() is insufficient on non-TSO architectures. */
>> +#define S_UNLOCK(lock) _InterlockedExchange(lock, 0)
>> +#else
>> #pragma intrinsic(_ReadWriteBarrier)
>>
>> #define S_UNLOCK(lock) \
>> do { _ReadWriteBarrier(); (*(lock)) = 0; } while (0)
>>
>> #endif
>> +#endif
>
> The newline placement looks odd here. I'd add newlines around the #else.

Hey Andres, thanks for chiming in! I agree with your suggestions and it looks like Nathan has already addressed them to your satisfaction. I've applied v11 on my "unicorn" Win11 ARM64 MSVC system and it's testing now.

best.

-greg

> Greetings,
>
> Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2025-12-16 13:00:23 Re: Error position support for ComputeIndexAttrs
Previous Message Greg Burd 2025-12-16 12:37:10 Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers