From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_atomic_compare_exchange_*() and memory barriers
Date: 2025-03-07 17:08:34
Message-ID: CALT9ZEHDC1jz3z54hJ2_PvL+6D+u4niXGVxcBvbhHaNeGOXOhw@mail.gmail.com
Lists: pgsql-hackers
Hi, Alexander and Andres!
On Fri, 7 Mar 2025 at 20:40, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> Hi, Andres.
>
> Thank you for reply.
>
> On Fri, Mar 7, 2025 at 6:02 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > On 2025-03-07 17:47:08 +0200, Alexander Korotkov wrote:
> > > While investigating a bug in the patch to get rid of WALBufMappingLock, I
> > > found that the surrounding pg_atomic_compare_exchange_u64() fixes the
> > > problem for me.
> >
> > That sounds more likely to be due to slowing down things enough to make a race
> > less likely to be hit. Or a compiler bug.
> >
> >
> > > That doesn't feel right because, according to the
> > > comments, both pg_atomic_compare_exchange_u32() and
> > > pg_atomic_compare_exchange_u64() should provide full memory barrier
> > > semantics. So, I decided to investigate this further.
> > >
> > > In my case, the implementation of pg_atomic_compare_exchange_u64() is based
> > > on __atomic_compare_exchange_n().
> > >
> > > static inline bool
> > > pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
> > >                                     uint64 *expected, uint64 newval)
> > > {
> > >     AssertPointerAlignment(expected, 8);
> > >     return __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
> > >                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
> > > }
> > >
> > > According to the docs __ATOMIC_SEQ_CST enforces total ordering with *all
> > > other __ATOMIC_SEQ_CST operations*. It only says about other
> > > __ATOMIC_SEQ_CST operations; nothing is said about regular reads/writes.
> > > This sounds quite far from our comment, promising full barrier semantics,
> > > which, in my understanding, means the equivalent of
> > > both pg_read_barrier()/pg_write_barrier(), which should barrier *all other
> > > read/writes*.
> >
> > A memory barrier in one process/thread always needs be paired with a barrier
> > in another process/thread. It's not enough to have a memory barrier on one
> > side but not the other.
>
> Yes, there surely should be a memory barrier on the other side. But
> does __ATOMIC_SEQ_CST work as a barrier for the regular read/write
> operations on the same side?
According to a reference posted by Andres [1]:
"Within a thread of execution, accesses (reads and writes) through
volatile lvalues cannot be reordered past observable side-effects
(including other volatile accesses) that are separated by a sequence
point within the same thread, but this order is not guaranteed to be
observed by another thread, since volatile access does not establish
inter-thread synchronization."
Also: "as soon as atomic operations that are not tagged
memory_order_seq_cst enter the picture, the sequential consistency is
lost"
So I think Alexander is right in the original post: the order of
everything not tagged __ATOMIC_SEQ_CST (atomic or non-atomic
operations) relative to atomic operations tagged __ATOMIC_SEQ_CST is
not guaranteed.
> > __ATOMIC_SEQ_CST is used to implement the C11/C++11 barriers, and for those
> > it's more clearly formulated that that include acquire/release semantics. The
> > standard formulation is a bit more complicated IIRC, but here's the
> > cppreference.com simplification:
> >
> > https://en.cppreference.com/w/c/atomic/memory_order
> > > A load operation with this memory order performs an acquire operation, a
> > > store performs a release operation, and read-modify-write performs both an
> > > acquire operation and a release operation, plus a single total order exists
> > > in which all threads observe all modifications in the same order (see
> > > Sequentially-consistent ordering below).
>
>
> Thank you. This is not yet clear for me. I probably need to meditate
> more on this :)
I think __ATOMIC_ACQ_REL is needed on success, and __ATOMIC_ACQUIRE
might be sufficient on failure: a failed __atomic_compare_exchange_n
performs no modification of the atomic variable, so we want to see all
modifications from the concurrent operations, but don't need to make
"our" changes visible to them. [2]
Maybe I'm too optimistic here and a full memory barrier is better.
Thank you very much for considering this issue! I think the correct
operation of atomic primitives is both very important and hard to
check thoroughly. A test that could reproduce the reordering of atomic
and non-atomic operations would be very useful in the regression set.
[1] https://en.cppreference.com/w/c/atomic/memory_order#Sequentially-consistent_ordering
[2] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
Best regards,
Pavel Borisov
Supabase