Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

From: Andres Freund <andres(at)anarazel(dot)de>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())
Date: 2023-01-09 22:24:32
Message-ID: 20230109222432.mercsr67lkoz67rw@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-12-08 12:29:54 +0530, Bharath Rupireddy wrote:
> On Tue, Dec 6, 2022 at 12:00 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> > pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
> >
> > * Test first to see if it the slot is free right now.
> > *
> > * XXX: the caller uses a spinlock before this, so we don't need a memory
> > * barrier here as far as the current usage is concerned. But that might
> > * not be safe in general.
> >
> > which happens to be true in the single, indirect, caller:
> >
> > /* Read the current insert position */
> > SpinLockAcquire(&Insert->insertpos_lck);
> > bytepos = Insert->CurrBytePos;
> > SpinLockRelease(&Insert->insertpos_lck);
> > reservedUpto = XLogBytePosToEndRecPtr(bytepos);
> >
> > I think at the very least we ought to have a comment in
> > WaitXLogInsertionsToFinish() highlighting this.
>
> So, using a spinlock ensures no memory ordering occurs while reading
> lock->state in LWLockConflictsWithVar()?

No, a spinlock *does* imply ordering. But your patch does remove several of
the spinlock acquisitions (via LWLockWaitListLock()), and it moves the
assignment in LWLockUpdateVar() out from under the spinlock.

If you remove spinlock operations (or other barrier primitives), you need to
make sure that such modifications don't break the required memory ordering.
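
For context, the ordering that the current code (implicitly) relies on looks
roughly like this - a simplified sketch of lwlock.c / xlog.c, not the literal
source, with the slow-path details elided:

/* writer side: LWLockUpdateVar(), as it stands today */
LWLockWaitListLock(lock);       /* spinlock acquire - implies a barrier */
*valptr = val;                  /* publish the new insertingAt position */
/* ... move LW_WAIT_UNTIL_FREE waiters onto a wakeup list ... */
LWLockWaitListUnlock(lock);     /* spinlock release - implies a barrier */

/* reader side: WaitXLogInsertionsToFinish() -> LWLockConflictsWithVar() */
SpinLockAcquire(&Insert->insertpos_lck);    /* the caller's barrier ... */
bytepos = Insert->CurrBytePos;
SpinLockRelease(&Insert->insertpos_lck);

value = *valptr;                /* ... which is what the XXX comment relies
                                 * on to justify this barrier-less read */

Take away either side's spinlock without adding an equivalent barrier and
that justification is gone.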

> How does spinlock that gets acquired and released in the caller
> WaitXLogInsertionsToFinish() itself and the memory barrier in the called
> function LWLockConflictsWithVar() relate here? Can you please help me
> understand this a bit?

The caller's barrier means that we'll see values that are at least as "up to
date" as at the time of the barrier (it's a bit more complicated than that; a
barrier always needs to be paired with another barrier on the other side).
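
Schematically (nothing PostgreSQL-specific, just to illustrate the pairing;
the variables are made up):

/* writer */
shared_payload = compute();     /* 1. write the data           */
pg_write_barrier();             /* 2. order 1 before 3         */
published = true;               /* 3. signal that it's ready   */

/* reader */
if (published)                  /* 4. observe the signal       */
{
    pg_read_barrier();          /* 5. order 4 before 6         */
    use(shared_payload);        /* 6. sees the data from 1     */
}

The write barrier on its own guarantees nothing; it only helps because the
reader has a matching barrier between its two reads.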

> > It's not at all clear to me that the proposed fast-path for LWLockUpdateVar()
> > is safe. I think at the very least we could end up missing waiters that we
> > should have woken up.
> >
> > I think it ought to be safe to do something like
> >
> > pg_atomic_exchange_u64()..
> > if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
> > return;
>
> pg_atomic_exchange_u64(&lock->state, exchange_with_what_?). Exchange will
> change the value, no?

Not lock->state, but the atomic passed to LWLockUpdateVar(), which we do want
to update. A pg_atomic_exchange_u64() includes a memory barrier.
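
I.e. something along these lines - just a sketch, assuming valptr is already
a pg_atomic_uint64 * as in your patch, and with the slow path elided:

void
LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
{
	/*
	 * Update the variable first.  pg_atomic_exchange_u64() is a full
	 * barrier, so the read of lock->state below can't be reordered to
	 * before the new value became visible.
	 */
	pg_atomic_exchange_u64(valptr, val);

	/* Nobody waiting?  Then there's nothing to wake up, we're done. */
	if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
		return;

	/*
	 * Somebody is (or at least was) waiting: fall through to the existing
	 * slow path - LWLockWaitListLock(), move LW_WAIT_UNTIL_FREE waiters
	 * onto a wakeup list, LWLockWaitListUnlock(), wake them up.
	 */
}

The full barrier from the exchange is what makes reading lock->state *after*
the update meaningful at all.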

> > because the pg_atomic_exchange_u64() will provide the necessary memory
> > barrier.
>
> I'm reading some comments [1], are these also true for 64-bit atomic
> CAS?

Yes. See
/* ----
* The 64 bit operations have the same semantics as their 32bit counterparts
* if they are available. Check the corresponding 32bit function for
* documentation.
* ----
*/

> Does it mean that an atomic CAS operation inherently provides a
> memory barrier?

Yes.
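
In API terms, a minimal usage sketch (the variable is made up, not from your
patch):

#include "port/atomics.h"

static pg_atomic_uint64 shared_pos;

static void
cas_demo(void)
{
	uint64		expected = 0;

	pg_atomic_init_u64(&shared_pos, 0);

	/*
	 * Store 1 iff shared_pos is still 0; on failure 'expected' is
	 * overwritten with the current value.  Success or failure, the CAS
	 * acts as a full memory barrier, so loads and stores before/after it
	 * can't be reordered across it.
	 */
	if (pg_atomic_compare_exchange_u64(&shared_pos, &expected, 1))
	{
		/* we installed 1 */
	}
}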

> Can you please point me if it's described better somewhere else?

I'm not sure what you'd like to have described more extensively, tbh.

Greetings,

Andres Freund
