Re: valgrind versus pg_atomic_init()

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: valgrind versus pg_atomic_init()
Date: 2020-06-15 01:55:27
Message-ID: 20200615015527.GA2850107@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 07, 2020 at 12:23:35AM -0400, Tom Lane wrote:
> I experimented with running "make check" on ARM64 under a reasonably
> bleeding-edge valgrind (3.16.0). One thing I ran into is that
> regress.c's test_atomic_ops fails; valgrind shows the stack trace
>
> fun:__aarch64_cas8_acq_rel
> fun:pg_atomic_compare_exchange_u64_impl
> fun:pg_atomic_exchange_u64_impl
> fun:pg_atomic_write_u64_impl
> fun:pg_atomic_init_u64_impl
> fun:pg_atomic_init_u64
> fun:test_atomic_uint64
> fun:test_atomic_ops
> fun:ExecInterpExpr
>
> Now, this is basically the same thing as is already memorialized in
> src/tools/valgrind.supp:
>
> # Atomic writes to 64bit atomic vars uses compare/exchange to
> # guarantee atomic writes of 64bit variables. pg_atomic_write is used
> # during initialization of the atomic variable; that leads to an
> # initial read of the old, undefined, memory value. But that's just to
> # make sure the swap works correctly.
> {
> uninitialized_atomic_init_u64
> Memcheck:Cond
> fun:pg_atomic_exchange_u64_impl
> fun:pg_atomic_write_u64_impl
> fun:pg_atomic_init_u64_impl
> }
>
> so my first thought was that we just needed an architecture-specific
> variant of that. But on thinking more about this, it seems like
> generic.h's version of pg_atomic_init_u64_impl is just fundamentally
> misguided. Why isn't it simply assigning the value with an ordinary
> unlocked write? By definition, we had better not be using this function
> in any circumstance where there might be conflicting accesses

Does something guarantee the write will be globally-visible by the time the
first concurrent accessor shows up? (If not, one could (a) do an unlocked
ptr->value=0, then the atomic write, or (b) revert and improve the
suppression.) I don't doubt it's fine for the ways PostgreSQL uses atomics
today, which generally initialize an atomic before the concurrent-accessor
processes even exist.

> , so I don't
> see why we should need to tolerate a valgrind exception here. Moreover,
> if a simple assignment *isn't* good enough, then surely the spinlock
> version in atomics.c is 100% broken.

Are you saying it would imply a bug in atomics.c pg_atomic_init_u64_impl(),
pg_atomic_compare_exchange_u64_impl(), or pg_atomic_fetch_add_u64_impl()? Can
you explain that more? If you were referring to unlocked "*(lock) = 0", that
is different since it's safe to have a delay in propagation of the change from
locked state to unlocked state.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-06-15 02:07:45 Re: Postgresql13_beta1 (could not rename temporary statistics file) Windows 64bits
Previous Message vignesh C 2020-06-15 01:20:10 Re: Include access method in listTables output