Re: base backup client as auxiliary backend process

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: base backup client as auxiliary backend process
Date: 2020-02-05 04:52:05
Message-ID: CA+fd4k69vxun0utUeZsmBk=RRCUFa8zm+T0ORkXirEnyiTVkuw@mail.gmail.com
Lists: pgsql-hackers

On Mon, 3 Feb 2020 at 20:06, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2020-01-11 10:52:30 +0100, Peter Eisentraut wrote:
> > On 2020-01-10 04:32, Masahiko Sawada wrote:
> > > I agreed that these patches are useful on its own and 0001 patch and
> >
> > committed 0001
>
> over on -committers Robert complained:
>
> On 2020-01-23 15:49:37 -0500, Robert Haas wrote:
> > On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
> > > walreceiver uses a temporary replication slot by default
> > >
> > > If no permanent replication slot is configured using
> > > primary_slot_name, the walreceiver now creates and uses a temporary
> > > replication slot. A new setting wal_receiver_create_temp_slot can be
> > > used to disable this behavior, for example, if the remote instance is
> > > out of replication slots.
> > >
> > > Reviewed-by: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
> > > Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com
> >
> > Neither the commit message for this patch nor any of the comments in
> > the patch seem to explain why this is a desirable change.
> >
> > I assume that's probably discussed on the thread that is linked here,
> > but you shouldn't have to dig through the discussion thread to figure
> > out what the benefits of a change like this are.
>
> which I fully agree with.
>
>
> It's not at all clear to me that the potential downsides of this have
> been fully thought through. And even if they have, they've not been
> documented.
>
> Previously if a standby without a slot was slow receiving WAL,
> e.g. because the network bandwidth was insufficient, it'd at some point
> just fail because the required WAL is removed. But with this patch that
> won't happen - instead the primary will just run out of space. At the
> very least this would need to add documentation of this caveat to a few
> places.

+1 for adding the downsides to the documentation.

It might not happen often, but with this parameter we will need to set
max_replication_slots high enough; otherwise the standby will fail to
start after a failover because no replication slots are free.

WAL required by the standby can be removed on the primary when the
standby falls far behind, for example when the standby has been stopped
for a long time, or when the standby is running but delayed for some
reason. This feature prevents WAL removal in the latter case. That is,
we can ensure that required WAL is not removed while replication is
running. For the former case we can use a permanent replication slot.
Although there is a risk of running out of space, I personally think
this behavior is better for most cases.
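
To make the trade-off concrete, here is a toy model of the two
behaviors. It is only a sketch and not PostgreSQL code: the function
simulate(), its parameters, and the idea of counting WAL in whole
"segments" per step are assumptions made for illustration, and
keep_segments only loosely stands in for whatever WAL retention the
primary has without a slot.

```python
def simulate(produce_rate, receive_rate, steps, keep_segments, use_slot):
    """Toy model of WAL retention; returns (outcome, peak_retained).

    All names and units here are hypothetical. Positions are counted
    in WAL segments, and one loop iteration is one unit of time.
    """
    primary_pos = 0   # WAL generated by the primary so far
    standby_pos = 0   # WAL received by the standby so far
    oldest_kept = 0   # oldest segment still present on the primary
    peak_retained = 0

    for _ in range(steps):
        primary_pos += produce_rate

        if use_slot:
            # A slot pins WAL at the standby's position, so nothing
            # the standby still needs is ever removed.
            oldest_kept = standby_pos
        else:
            # Without a slot, only the newest keep_segments survive.
            oldest_kept = max(oldest_kept, primary_pos - keep_segments)

        if standby_pos < oldest_kept:
            # The standby needs a segment that is already gone.
            return ("standby failed: required WAL removed", peak_retained)

        # The standby receives what it can during this step.
        standby_pos = min(primary_pos, standby_pos + receive_rate)
        peak_retained = max(peak_retained, primary_pos - oldest_kept)

    return ("ok", peak_retained)


# A standby that receives WAL more slowly than the primary produces it.
print(simulate(10, 6, 20, keep_segments=32, use_slot=False))
print(simulate(10, 6, 20, keep_segments=32, use_slot=True))
```

In this model the slow standby without a slot fails once its required
WAL has been removed, while with a slot it keeps running and the WAL
retained on the primary grows for as long as the standby lags. That
growth is the disk-space risk mentioned above.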

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
