Re: Assertion failure in WaitForWALToBecomeAvailable state machine

From: Noah Misch <noah(at)leadboat(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Assertion failure in WaitForWALToBecomeAvailable state machine
Date: 2022-09-15 13:58:43
Message-ID: 20220915135843.GA1520945@rfd.leadboat.com
Lists: pgsql-hackers

On Tue, Sep 13, 2022 at 11:56:16AM +0530, Bharath Rupireddy wrote:
> On Tue, Sep 13, 2022 at 8:52 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > > > [1] - https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com
> >
> > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at
> > the code location being changed. Today, in the unlikely event of
> > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code
> > proceeds without waiting for WALRCV_STOPPED.

Pushed that way.
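
For readers following along, the difference described above can be sketched as a toy C model. This is not PostgreSQL source: every toy_ name is hypothetical, the state transitions are collapsed into plain return values, and it assumes WalRcvStreaming() is true only for the starting, streaming, and restarting states.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy copy of the walreceiver states discussed in this thread. */
typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} WalRcvState;

/* Assumption: "streaming" excludes WALRCV_WAITING and WALRCV_STOPPING. */
static bool
toy_WalRcvStreaming(WalRcvState s)
{
    return s == WALRCV_STARTING ||
           s == WALRCV_STREAMING ||
           s == WALRCV_RESTARTING;
}

/* Stand-in for XLogShutdownWalRcv(): always ends in WALRCV_STOPPED. */
static WalRcvState
toy_XLogShutdownWalRcv(WalRcvState s)
{
    (void) s;                   /* real code would signal and wait here */
    return WALRCV_STOPPED;
}

/* Conditional shape: WAITING and STOPPING pass through untouched. */
static WalRcvState
toy_conditional_site(WalRcvState s)
{
    if (toy_WalRcvStreaming(s))
        s = toy_XLogShutdownWalRcv(s);
    return s;
}

/* Unconditional shape: WALRCV_STOPPED is guaranteed afterwards. */
static WalRcvState
toy_unconditional_site(WalRcvState s)
{
    s = toy_XLogShutdownWalRcv(s);
    assert(s == WALRCV_STOPPED);
    return s;
}
```

In this model, toy_conditional_site(WALRCV_WAITING) leaves the state at WALRCV_WAITING, while toy_unconditional_site() returns WALRCV_STOPPED for every input.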

> Hm. That was the original fix [2] proposed and it works. The concern
> is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv()
> (it calls ConditionVariablePrepareToSleep() and
> ConditionVariableCancelSleep(), which involve 2 lock acquisitions and
> releases, plus 1 function call, WalRcvRunning()) even in the
> WALRCV_STOPPED case. All this is unnecessary IMO when we have determined
> that the walreceiver state is already WALRCV_STOPPED.

That's fine. If we're reaching this code at high frequency, that implies
we're also forking walreceiver processes at high frequency. This code would
be a trivial part of the overall cost.

> > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I
> > perhaps should back-patch the change to released versions. Does anyone know
> > whether one or both can happen?

If anyone discovers such cases later, we can extend the back-patch then.

> IMO, we must back-patch to the version where
> cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 got introduced irrespective
> of any of the above happening.

Correct. The sentences were about *released* versions, not v15.
