Re: Minimal logical decoding on standbys

From: Andres Freund <andres(at)anarazel(dot)de>
To: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
Cc: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, "[pgdg] Robert Haas" <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minimal logical decoding on standbys
Date: 2021-08-02 16:01:33
Message-ID: 20210802160133.uugcce5ql4m5mv5m@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-08-02 16:45:23 +0200, Drouvot, Bertrand wrote:
> On 7/27/21 7:22 PM, Andres Freund wrote:
> > On 2021-07-27 09:23:48 +0200, Drouvot, Bertrand wrote:
> > > Thanks for the warning, rebase done and new v21 version attached.
> > Did you have a go at fixing the walsender race conditions I
> > (re-)discovered? Without fixing those I don't see this patch going in...
>
> Those new patches should be addressing all your previous code and TAP tests
> remarks, except those 2 for which I would need your input:
>
> 1. The first one is linked to your remarks:
> "
>
> > While working on this I found a, somewhat substantial, issue:
> >
> > When the primary is idle, on the standby logical decoding via walsender
> > will typically not process the records until further WAL writes come in
> > from the primary, or until a 10s lapsed.
> >
> > The problem is that WalSndWaitForWal() waits for the *replay* LSN to
> > increase, but gets woken up by walreceiver when new WAL has been
> > flushed. Which means that typically walsenders will get woken up at the
> > same time that the startup process will be - which means that by the
> > time the logical walsender checks GetXLogReplayRecPtr() it's unlikely
> > that the startup process already replayed the record and updated
> > XLogCtl->lastReplayedEndRecPtr.
> >
> > I think fixing this would require too invasive changes at this point. I
> > think we might be able to live with 10s delay issue for one release, but
> > it sure is ugly :(.
>
> This is indeed pretty painful. It's a lot more regularly occuring if you
> either have a slot disk, or you switch around the order of
> WakeupRecovery() and WalSndWakeup() XLogWalRcvFlush().
>
> "
>
> Is that what you are referring to as the “walsender race conditions”?

Yes.

> If so, do you already have in mind a way to handle this? (I thought you
> already had in mind a way to handle it so the question)

Yes. I think we need to add a condition variable to be able to wait for
WAL positions to change. Either multiple condition variables (one for
the flush position, one for the replay position), or one that just
changes more often. That way one can wait for apply without a race
condition.

> 2. The second one is linked to your remark:
>
> "There's also no test 
for a recovery conflict due to row removal"
>
> Don't you think that the 
"vacuum full" conflict test is enough?

It's not. It'll cause conflicts due to exclusive locks etc.

> if not, what kind of additional 
tests would you like to see?

A few catalog rows being removed (e.g. due to DELETE and then VACUUM
*without* full) and a standby without hot_standby_feedback catching
that.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-08-02 16:41:24 straightening out backend process startup
Previous Message vignesh C 2021-08-02 15:56:01 Re: [PATCH]Comment improvement in publication.sql