Re: WAL Insertion Lock Improvements

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WAL Insertion Lock Improvements
Date: 2023-07-25 22:39:37
Message-ID: ZMBPKbQ/wQZ9k25w@paquier.xyz
Lists: pgsql-hackers

On Tue, Jul 25, 2023 at 12:57:37PM -0700, Andres Freund wrote:
> I just rebased my aio tree over the commit and promptly, on the first run, saw
> a hang. I did some debugging on that. Unfortunately repeated runs haven't
> repeated that hang, despite quite a bit of trying.
>
> The symptom I was seeing is that all running backends were stuck in
> LWLockWaitForVar(), even though the value they're waiting for had
> changed. Which obviously "shouldn't be possible".

Hmm. I've also spent a few days looking at the past report that made
the LWLock code what it is today, but I don't immediately see how it
would be possible to reach a state where all the backends are waiting
for an update that's not happening:
https://www.postgresql.org/message-id/CAMkU=1zLztROwH3B42OXSB04r9ZMeSk3658qEn4_8+b+K3E7nQ@mail.gmail.com

The assumptions this code makes, and its dependencies on
xloginsert.c, are hard to track down.

> It's of course possible that this is AIO specific, but I didn't see anything
> in stacks to suggest that.

Or AIO handles the WAL syncs so quickly that it is more likely to
expose a race condition here?

> I do wonder if this possibly exposed an undocumented prior dependency on the
> value update always happening under the list lock.

I would not be surprised by that.
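
To make that kind of ordering dependency concrete, here is a toy,
single-threaded replay of a "wait until a variable changes" protocol.
It is only a sketch under stated assumptions, not PostgreSQL code and
not a diagnosis of the hang reported above: the step names, the
`run()` helper and the two schedules are all hypothetical. It shows
how a wakeup can be lost when the value is stored and the wait list is
inspected in between a waiter's check and its enqueue, and how
rechecking the value after enqueueing avoids that.

```python
# Toy model of a wait-for-variable protocol. Not PostgreSQL code;
# every name below is hypothetical and exists only for illustration.

def run(schedule, recheck):
    """Replay one interleaving; return True if the waiter ends up stuck asleep."""
    state = {"value": 0, "queued": False, "asleep": False, "saw_old": False}

    def w_check():
        # Waiter reads the variable and remembers whether it is still the old value.
        state["saw_old"] = (state["value"] == 0)

    def w_enqueue():
        # Waiter adds itself to the wait list only if it saw the old value.
        if state["saw_old"]:
            state["queued"] = True

    def w_sleep():
        # Waiter goes to sleep; optionally it rechecks the value first.
        if state["queued"]:
            if recheck and state["value"] != 0:
                state["queued"] = False   # value already changed: dequeue, do not sleep
            else:
                state["asleep"] = True

    def u_store():
        # Updater publishes the new value.
        state["value"] = 1

    def u_wake():
        # Updater wakes whoever is on the wait list at this instant.
        if state["queued"]:
            state["queued"] = False
            state["asleep"] = False

    steps = {"w_check": w_check, "w_enqueue": w_enqueue, "w_sleep": w_sleep,
             "u_store": u_store, "u_wake": u_wake}
    for name in schedule:
        steps[name]()
    return state["asleep"]


# Waiter runs to completion, then the updater stores and wakes.
SERIAL = ["w_check", "w_enqueue", "w_sleep", "u_store", "u_wake"]

# Updater's store and wake both land between the waiter's check and enqueue.
RACY = ["w_check", "u_store", "u_wake", "w_enqueue", "w_sleep"]

if __name__ == "__main__":
    for label, sched in (("serial", SERIAL), ("racy", RACY)):
        for recheck in (False, True):
            print(label, "recheck=%s" % recheck, "stuck=%s" % run(sched, recheck))
```

In this model the only schedule that leaves the waiter asleep with the
value already changed is the racy one without the recheck. Making the
waiter's check and enqueue atomic with respect to the updater's store
and wake (what "under the list lock" would mean in this toy model)
rules that schedule out entirely.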
--
Michael
