Re: pg_atomic_compare_exchange_*() and memory barriers

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_atomic_compare_exchange_*() and memory barriers
Date: 2025-03-07 17:08:34
Message-ID: CALT9ZEHDC1jz3z54hJ2_PvL+6D+u4niXGVxcBvbhHaNeGOXOhw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Alexander and Andres!

On Fri, 7 Mar 2025 at 20:40, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> Hi, Andres.
>
> Thank you for reply.
>
> On Fri, Mar 7, 2025 at 6:02 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > On 2025-03-07 17:47:08 +0200, Alexander Korotkov wrote:
> > > While investigating a bug in the patch to get rid of WALBufMappingLock, I
> > > found that the surrounding pg_atomic_compare_exchange_u64() fixes the
> > > problem for me.
> >
> > That sounds more likely to be due to slowing down things enough to make a race
> > less likely to be hit. Or a compiler bug.
> >
> >
> > > That doesn't feel right because, according to the
> > > comments, both pg_atomic_compare_exchange_u32() and
> > > pg_atomic_compare_exchange_u64() should provide full memory barrier
> > > semantics. So, I decided to investigate this further.
> > >
> > > In my case, the implementation of pg_atomic_compare_exchange_u64() is based
> > > on __atomic_compare_exchange_n().
> > >
> > > static inline bool
> > > pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
> > > uint64 *expected, uint64 newval)
> > > {
> > > AssertPointerAlignment(expected, 8);
> > > return __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
> > > __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
> > > }
> > >
> > > According to the docs __ATOMIC_SEQ_CST enforces total ordering with *all
> > > other __ATOMIC_SEQ_CST operations*. It only says about other
> > > __ATOMIC_SEQ_CST operations; nothing is said about regular reads/writes.
> > > This sounds quite far from our comment, promising full barrier semantics,
> > > which, in my understanding, means the equivalent of
> > > both pg_read_barrier()/pg_write_barrier(), which should barrier *all other
> > > read/writes*.
> >
> > A memory barrier in one process/thread always needs be paired with a barrier
> > in another process/thread. It's not enough to have a memory barrier on one
> > side but not the other.
>
> Yes, there surely should be a memory barrier on another side. But
> does __ATOMIC_SEQ_CST works as a barrier for the regular read/write
> operations on the same side?
According to a reference posted by Andres [1]:
"Within a thread of execution, accesses (reads and writes) through
volatile lvalues cannot be reordered past observable side-effects
(including other volatile accesses) that are separated by a sequence
point within the same thread, but this order is not guaranteed to be
observed by another thread, since volatile access does not establish
inter-thread synchronization."

Also: "as soon as atomic operations that are not tagged
memory_order_seq_cst enter the picture, the sequential consistency is
lost"

So I think Alexander is right in the original post. The order of
everything not marked as ATOMIC_SEQ_CST (atomic or non-atomic
operations) compared to atomic operations marked by ATOMIC_SEQ_CST is
not warranted.

> > __ATOMIC_SEQ_CST is used to implement the C11/C++11 barriers, and for those
> > it's more clearly formulated that that include acquire/release semantics. The
> > standard formulation is a bit more complicated IIRC, but here's the
> > cppreference.com simplification:
> >
> > https://en.cppreference.com/w/c/atomic/memory_order
> > > A load operation with this memory order performs an acquire operation, a
> > > store performs a release operation, and read-modify-write performs both an
> > > acquire operation and a release operation, plus a single total order exists
> > > in which all threads observe all modifications in the same order (see
> > > Sequentially-consistent ordering below).
>
>
> Thank you. This is not yet clear for me. I probably need to meditate
> more on this :)

I think __ATOMIC_ACQ_REL is needed for success, and __ATOMIC_ACQUIRE
might be sufficient for failure as failure of
__atomic_compare_exchange_n contains no modification of atomic
variable, so we want to see all modifications from the concurrent
operations, but will not enforce them to see any "our" change. [2]
Maybe I'm too optimistic here and having a full memory barrier is
better.

Thank you very much for considering this issue! I think the right
operation of atomic primitives is both very very important and also
hard to check thoroughly. I think the test that could reproduce the
reordering of atomic and non-atomic operations could be very useful in
regression set.

[1] https://en.cppreference.com/w/c/atomic/memory_order#Sequentially-consistent_ordering
[2] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Best regards,
Pavel Borisov
Supabase

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2025-03-07 17:15:23 Re: pg_atomic_compare_exchange_*() and memory barriers
Previous Message Andres Freund 2025-03-07 17:07:56 Re: pg_atomic_compare_exchange_*() and memory barriers