| From: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> | 
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> | 
| Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | SpinLockAcquire and SpinLockRelease is broken on ARM/ARM64? (Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum) | 
| Date: | 2025-05-05 07:30:50 | 
| Message-ID: | 37dbaeaa-8b14-40dc-96e0-5eba595f0f0a@postgrespro.ru | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
21.03.2025 19:33, Andres Freund пишет:
> Hi,
> 
> On 2025-03-21 14:35:16 +0300, Yura Sokolov wrote:
>> From 080c9e0de5e6e10751347e1ff50b65df424744cb Mon Sep 17 00:00:00 2001
>> From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
>> Date: Mon, 3 Feb 2025 11:58:33 +0300
>> Subject: [PATCH v2] sinvaladt.c: use atomic operations on maxMsgNum
>>
>> msgnumLock spinlock could be highly contended.
>> Comment states it was used as memory barrier.
>> Lets use atomic ops with memory barriers directly instead.
>>
>> Note: patch uses pg_read_barrier()/pg_write_barrier() instead of
>> pg_atomic_read_membarrier_u32()/pg_atomic_write_membarrier_u32() since
>> no full barrier semantic is required, and explicit read/write barriers
>> are cheaper at least on x86_64.
> 
> Is it actually true that full barriers aren't required?  I think we might
> actually rely on a stronger ordering.
> 
> 
>> @@ -506,10 +493,9 @@ SIGetDataEntries(SharedInvalidationMessage *data, int datasize)
>>  	 */
>>  	stateP->hasMessages = false;
>>
>> -	/* Fetch current value of maxMsgNum using spinlock */
>> -	SpinLockAcquire(&segP->msgnumLock);
>> -	max = segP->maxMsgNum;
>> -	SpinLockRelease(&segP->msgnumLock);
>> +	/* Fetch current value of maxMsgNum using atomic with memory barrier */
>> +	max = pg_atomic_read_u32(&segP->maxMsgNum);
>> +	pg_read_barrier();
>>
>>  	if (stateP->resetState)
>>  	{
>> 		/*
>> 		 * Force reset.  We can say we have dealt with any messages added
>> 		 * since the reset, as well; and that means we should clear the
>> 		 * signaled flag, too.
>> 		 */
>> 		stateP->nextMsgNum = max;
>> 		stateP->resetState = false;
>> 		stateP->signaled = false;
>> 		LWLockRelease(SInvalReadLock);
>> 		return -1;
>> 	}
> 
> vs
> 
>> @@ -410,17 +398,16 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
>>  		/*
>>  		 * Insert new message(s) into proper slot of circular buffer
>>  		 */
>> -		max = segP->maxMsgNum;
>> +		max = pg_atomic_read_u32(&segP->maxMsgNum);
>>  		while (nthistime-- > 0)
>>  		{
>>  			segP->buffer[max % MAXNUMMESSAGES] = *data++;
>>  			max++;
>>  		}
>>
>> -		/* Update current value of maxMsgNum using spinlock */
>> -		SpinLockAcquire(&segP->msgnumLock);
>> -		segP->maxMsgNum = max;
>> -		SpinLockRelease(&segP->msgnumLock);
>> +		/* Update current value of maxMsgNum using atomic with memory barrier */
>> +		pg_write_barrier();
>> +		pg_atomic_write_u32(&segP->maxMsgNum, max);
>>
>>  		/*
>>  		 * Now that the maxMsgNum change is globally visible, we give everyone
>> 		 * a swift kick to make sure they read the newly added messages.
>> 		 * Releasing SInvalWriteLock will enforce a full memory barrier, so
>> 		 * these (unlocked) changes will be committed to memory before we exit
>> 		 * the function.
>> 		 */
>> 		for (i = 0; i < segP->numProcs; i++)
>> 		{
>> 			ProcState  *stateP = &segP->procState[segP->pgprocnos[i]];
>>
>> 			stateP->hasMessages = true;
>> 		}
> 
> On a loosely ordered architecture, the hasMessage=false in SIGetDataEntries()
> could be reordered with the read of maxMsgNum. Which, afaict, would lead to
> missing messages. That's not prevented by the pg_write_barrier() in
> SIInsertDataEntries().  I think there may be other similar dangers.
> 
> This could be solved by adding full memory barriers in a few places.
It is quite interesting remark, because SpinLockAcquire may be implemented
as `__sync_lock_test_and_set(lock, 1)` [1] on ARM/ARM64, and
`__sync_lock_test_and_set` has only "acquire barrier" semantic as GCC's
documentation says [2] . More over, `__sync_lock_release` used in this case
also provides only "release barrier" semantic.
Therefore, concerning these facts, current code state also doesn't give
guarantee `stateP->hasMessage = false` will not be reordered with `max =
segP->maxMsgNum`. Nor following read of messages are guaranteed to happen
after `max = segP->maxMsgNum`.
Given your expectations for SpinLockAcquire and SpinLockRelease to be full
memory barriers, and probably numerous usages of them as such, their
current implementation on ARM/ARM64 looks to be completely broken, imho.
-- 
regards
Yura Sokolov aka funny-falcon
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Laurenz Albe | 2025-05-05 07:31:00 | Re: PATCH: pg_dump to support "on conflict do update" | 
| Previous Message | Amit Kapila | 2025-05-05 06:18:35 | Re: Add an option to skip loading missing publication to avoid logical replication failure |