Re: Create replication slot in pg_basebackup if requested and not yet present

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Create replication slot in pg_basebackup if requested and not yet present
Date: 2017-09-11 07:11:57
Message-ID: 20170911071157.GB4750@nighthawk.caipicrew.dd-dns.de
Lists: pgsql-hackers

Hi,

On Fri, Sep 08, 2017 at 10:30:20AM -0700, Jeff Janes wrote:
> On Wed, Sep 6, 2017 at 9:22 AM, Peter Eisentraut <
> peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>
> > On 8/18/17 05:28, Michael Banck wrote:
> > >>> Rebased, squashed and slightly edited version attached. I've added this
> > >>> to the 2017-07 commitfest now as well:
> > >>>
> > >>> https://commitfest.postgresql.org/14/1112/
> > >> Can you rebase this past some conflicting changes?
> > > Thanks for letting me know, PFA a rebased version.
> >
> > I have reviewed the thread so far. I think there is general agreement
> > that something like this would be good to have.
> >
> > I have not found any explanation, however, why the "if not exists"
> > behavior is desirable, let alone as the default. I can only think of
> > two workflows here: Either you have scripts for previous PG versions
> > that create the slot externally, in which case you omit --create, or you
> > use the new functionality to have pg_basebackup create the slot. I
> > don't see any use for pg_basebackup to opportunistically use a slot if
> > it happens to exist. Even if there is one, it should not be the
> > default. So please change that.
>
> +1. I'd rather just get an error and re-run without the --create switch.
> That way you are forced to think about what you are doing.

OK.

> Is there a race condition here? The slot is created after the checkpoint
> is completed. But you have to start streaming from the LSN where the
> checkpoint started, so shouldn't the slot be created before the checkpoint
> is started?

AFAICT, my patch only moves the slot creation slightly further forward.

AIUI, WAL streaming always begins at the last checkpoint, and in my tests
the restart_lsn of the created replication slot is also before that
checkpoint's LSN. However, I hope somebody more familiar with the
WAL/replication slot code can comment on that. What I dropped in the
refactoring was the RESERVE_WAL that used to be there when the temporary
slot gets created; I have re-added that now.

I also added a TAP test case that tries to check that the restart_lsn is
lower than the checkpoint_lsn, which appears to be the case.
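
To illustrate the comparison (this is only a rough sketch, not the TAP
test from the patch): an LSN in its textual form is two hexadecimal
numbers separated by a slash, the high and low 32 bits of a 64-bit
position, so the two values have to be compared numerically rather than
as strings. The function names and the sample LSNs below are made up for
the example.

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/3000028' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def restart_lsn_precedes(restart_lsn: str, checkpoint_lsn: str) -> bool:
    """Return True if the slot's restart_lsn is lower than the checkpoint LSN."""
    return parse_lsn(restart_lsn) < parse_lsn(checkpoint_lsn)


if __name__ == "__main__":
    # Hypothetical values, not taken from an actual test run.
    print(restart_lsn_precedes("0/2000028", "0/3000060"))
    # A plain string comparison would get this pair wrong.
    print(restart_lsn_precedes("0/A000000", "0/10000000"))
```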

If there is still a race condition here, do you have a suggestion on how
to trigger it?

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachment Content-Type Size
0001-Add-option-to-create-a-replication-slot-in-pg_baseba-v5.patch text/x-diff 12.8 KB
