Re: Walsender may fail to send wal to the end.

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: michael(at)paquier(dot)xyz, andres(at)anarazel(dot)de, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Walsender may fail to send wal to the end.
Date: 2021-03-29 15:41:32
Message-ID: 20210329154132.GI20766@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Kyotaro Horiguchi (horikyota(dot)ntt(at)gmail(dot)com) wrote:
> At Mon, 29 Mar 2021 14:47:33 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
> > On Fri, Mar 26, 2021 at 10:16:40AM -0700, Andres Freund wrote:
> > > On 2021-03-26 18:20:14 +0900, Kyotaro Horiguchi wrote:
> > > > This is because XLogSendPhysical detects that the shutdown checkpoint
> > > > has removed the WAL segment it is currently reading. However, there's
> > > > no risk of WAL segments being overwritten at that point.
> > > >
> > > > So I think we can omit the call to CheckXLogRemoved() while
> > > > MyWalSnd->state is WALSNDSTATE_STOPPING because the state comes after
> > > > the shutdown checkpoint completes.
> > > >
> > > > Of course that doesn't help if the walsender was running two segments
> > > > behind. There could still be a small window for the failure. But it's
> > > > a great help in the case of being just one segment behind.
> > >
> > > -1. This seems like a bandaid to make a broken configuration work a tiny
> > > bit better, without actually being meaningfully better.
> >
> > > Agreed. Still, wouldn't it be better to avoid such configurations and
> > > protect things a bit with a check on the new value?

I have a hard time agreeing that this is somehow a 'broken'
configuration; instead, it looks like a race condition that wasn't
considered and should be addressed. If there's zero lag, then we really
should allow the final WAL to get sent to the replica.

> The repro was a bit artificial, but the symptom occurred without
> pg_switch_wal() and with no load. It was caused just by shutting down the
> primary. If it is normal behavior for walsenders to fail to send the
> last shutdown record to the standby during a fast shutdown, we should
> refuse to start up at least the walsender if wal_keep_size = 0.
>
> I can guess two ways to do that.

Both of which will break things for people, so this certainly isn't a
great approach. Besides, if archiving is happening with
archive_command and the replica has a restore command, then it should be
able to follow that just fine, no? So we'd also have to check whether
archive_command has been set up and hope the admin has a restore
command. Having to go through that dance instead of just making sure to
push out the last WAL to the replica seems a bit silly, though.
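
For reference, the kind of archive-based setup meant here might look
roughly like the following. This is only a sketch: the archive directory
/mnt/server/archivedir is a hypothetical path, and the plain cp commands
stand in for whatever archiving tool is actually in use.

    # primary, postgresql.conf (archive path is an assumption)
    wal_keep_size = 0
    archive_mode = on
    archive_command = 'cp %p /mnt/server/archivedir/%f'

    # replica, postgresql.conf (same hypothetical archive path)
    restore_command = 'cp /mnt/server/archivedir/%f %p'

With something like this in place, a replica that misses the last
segment over streaming could still fetch it from the archive, which is
why a startup check on wal_keep_size alone would not be enough.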

Thanks,

Stephen
