Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())
Date: 2023-01-24 13:30:00
Message-ID: CALj2ACUc=Hcd3X+mBNAF2o9Y3r=cxVoz4JBayBTz=jw74KopHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 6, 2022 at 12:00 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi

Thanks for reviewing.

> FWIW, I don't see an advantage in 0003. If it allows us to make something else
> simpler / faster, cool, but on its own it doesn't seem worthwhile.

I've discarded this change.

> On 2022-12-02 16:31:58 -0800, Nathan Bossart wrote:
> > On Fri, Dec 02, 2022 at 04:32:38PM +0530, Bharath Rupireddy wrote:
> > > On Fri, Dec 2, 2022 at 6:10 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >> I'm not sure this is quite right - don't we need a memory barrier. But I don't
> > >> see a reason to not just leave this code as-is. I think this should be
> > >> optimized entirely in lwlock.c
> > >
> > > Actually, we don't need that at all as LWLockWaitForVar() will return
> > > immediately if the lock is free. So, I removed it.
> >
> > I briefly looked at the latest patch set, and I'm curious how this change
> > avoids introducing memory ordering bugs. Perhaps I am missing something
> > obvious.
>
> I'm a bit confused too - the comment above talks about LWLockWaitForVar(), but
> the patches seem to optimize LWLockUpdateVar().
>
> I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
>
> * Test first to see if it the slot is free right now.
> *
> * XXX: the caller uses a spinlock before this, so we don't need a memory
> * barrier here as far as the current usage is concerned. But that might
> * not be safe in general.
>
> which happens to be true in the single, indirect, caller:
>
> /* Read the current insert position */
> SpinLockAcquire(&Insert->insertpos_lck);
> bytepos = Insert->CurrBytePos;
> SpinLockRelease(&Insert->insertpos_lck);
> reservedUpto = XLogBytePosToEndRecPtr(bytepos);
>
> I think at the very least we ought to have a comment in
> WaitXLogInsertionsToFinish() highlighting this.

Done.

> It's not at all clear to me that the proposed fast-path for LWLockUpdateVar()
> is safe. I think at the very least we could end up missing waiters that we
> should have woken up.
>
> I think it ought to be safe to do something like
>
> pg_atomic_exchange_u64()..
> if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
> return;
>
> because the pg_atomic_exchange_u64() will provide the necessary memory
> barrier.

Done.

I'm attaching the v3 patch with the above review comments addressed.
Hopefully, no memory ordering issues now. FWIW, I've added it to CF
https://commitfest.postgresql.org/42/4141/.

Test results with the v3 patch and insert workload are the same as
that of the earlier run - TPS starts to scale at higher clients as
expected after 512 clients and peaks at 2X with 2048 and 4096 clients.

HEAD:
1 1380.411086
2 1358.378988
4 2701.974332
8 5925.380744
16 10956.501237
32 20877.513953
64 40838.046774
128 70251.744161
256 108114.321299
512 120478.988268
768 99140.425209
1024 93645.984364
2048 70111.159909
4096 55541.804826

v3 PATCHED:
1 1493.800209
2 1569.414953
4 3154.186605
8 5965.578904
16 11912.587645
32 22720.964908
64 42001.094528
128 78361.158983
256 110457.926232
512 148941.378393
768 167256.590308
1024 155510.675372
2048 147499.376882
4096 119375.457779

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v3-0001-Optimize-WAL-insertion-lock-acquisition-and-relea.patch application/octet-stream 9.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Geier 2023-01-24 13:30:34 Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
Previous Message Dean Rasheed 2023-01-24 13:10:03 to_hex() for negative inputs