From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, greg(at)burd(dot)me |
Subject: | Re: IO in wrong state on riscv64 |
Date: | 2025-10-22 21:51:26 |
Message-ID: | CA+hUKGLJHW7QjNGpfo+yKD6GzhdVnHxaSf0QJZp4VTu7jAt68A@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 22, 2025 at 11:51 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Have you figured out if the LLVM bug is specific to riscv or is more general?
> Afaict most of the MachineSink stuff mentioned in the llvm bug is generic, not
> arch specific. But I'd be surprised if such a problem in a more widely used
> arch hadn't been hit long ago...
It also happens on MIPS, Loongarch and x86 (I think?), but not POWER
with that godbolt link I posted.
I wonder if the direction this will go is that it's actually UB: the
C11 memory model only defines fences for atomic operations in section
7.17.4. I think that's basically what this says, and perhaps it's
also showing what that mention of mayStore is about:
https://llvm.org/docs/Atomics.html#optimization-outside-atomic
Surprisingly, even __sync_synchronize() doesn't seem to prevent Clang
from sinking that load instruction. That led me to an interesting old
discussion:
https://discourse.llvm.org/t/intended-semantics-for-fence-seq-cst/28591/4
"... it is not an explicit goal of LLVM to
support or continue supporting legacy code that did what it had to to
express functional concurrent code. It may happen to work now, but
unless enough LLVM users express interest this may break one day, and
__sync_synchronize may not order anything (it may just emit a fence
without forcing any ordering)."
"... trying to make user code that doesn't use
volatile or atomic, but does use __sync_synchronize, work as intended
isn't one of LLVM's goals."
Hngghhghh...
On a more practical note, Alexander, does this work?
#if !defined(pg_read_barrier_impl) && defined(HAVE_GCC__ATOMIC_INT32_CAS)
/* acquire semantics include read barrier semantics */
-# define pg_read_barrier_impl()
__atomic_thread_fence(__ATOMIC_ACQUIRE)
+# define pg_read_barrier_impl() do {
__atomic_signal_fence(__ATOMIC_ACQUIRE);
__atomic_thread_fence(__ATOMIC_ACQUIRE); } while (0)
#endif
Someone might want to write an arch-riscv.h that spits out the asm
instead (should be just FENCE R, R and FENCE W, W as far as I can see,
which might be slightly better than what acquire/release generate, on
paper at least), but apparently this problem is wider than RISCV so we
need a general solution anyway.
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2025-10-22 21:54:18 | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Previous Message | Nathan Bossart | 2025-10-22 21:48:34 | Re: fix type of infomask parameter in static inline functions |