Re: base backup client as auxiliary backend process

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: base backup client as auxiliary backend process
Date: 2019-11-07 04:16:30
Message-ID: 20191107041630.GK1768@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 28, 2019 at 09:30:52AM +0100, Peter Eisentraut wrote:
> On 2019-09-18 10:31, Michael Paquier wrote:
>> - * Verify XLOG status looks valid.
>> + * Check that contents look valid.
>> */
>> - if (ControlFile->state < DB_SHUTDOWNED ||
>> - ControlFile->state > DB_IN_PRODUCTION ||
>> - !XRecOffIsValid(ControlFile->checkPoint))
>> + if (!XRecOffIsValid(ControlFile->checkPoint))
>> ereport(FATAL,
>> Doesn't seem like a good idea to me to remove this sanity check for
>> normal deployments, but actually you moved that down in StartupXLOG().
>> It seems to me tha this is unrelated and could be a separate patch so
>> as the errors produced are more verbose. I think that we should also
>> change that code to use a switch/case on ControlFile->state.
>
> Done. Yes, this was really a change made to get more precise error messaged
> during debugging. It could be committed separately.

If you wish to do so now, that's fine by me.

>> The current defaults of pg_basebackup have been thought so as the
>> backups taken have a good stability and so as monitoring is eased
>> thanks to --wal-method=stream and the use of replication slots.
>> Shouldn't the use of a least a temporary replication slot be mandatory
>> for the stability of the copy? It seems to me that there is a good
>> argument for having a second process which streams WAL on top of the
>> main backup process, and just use a WAL receiver for that.
>
> Is this something that the walreceiver should be doing independent of this
> patch?

There could be an argument for switching a WAL receiver to use a
temporary replication slot by default. Still, it seems to me that
this backup solution suffers from the same set of problems we have
spent years to fix with pg_basebackup with missing WAL files caused by
concurrent checkpoints removing things needed while the copy of the
main data folder and other tablespaces happens.

>> One problem which is not tackled here is what to do for the tablespace
>> map. pg_basebackup has its own specific trick for that, and with that
>> new feature we may want something equivalent? Not something to
>> consider as a first stage of course.
>
> The updated has support for tablespaces without mapping. I'm thinking about
> putting the mapping specification into a GUC list somehow. Shouldn't be too
> hard.

That may become ugly if there are many tablespaces to take care of.
Another idea I can come up with would be to pass the new mapping to
initdb, still this requires an extra intermediate step to store the
new map, and then compare it with the mapping received at BASE_BACKUP
time. But perhaps you are looking for an experience different than
pg_basebackup. The first version of the patch does not actually
require that anyway..

>> No more strtol with base 10 stuff please :)
>
> Hmm, why not? What's the replacement?

I was referring to this patch:
https://commitfest.postgresql.org/25/2272/
It happens that all our calls of strtol in core use a base of 10. But
please just ignore this part.

ReceiveAndUnpackTarFile() is in both libpqwalreceiver.c and
pg_basebackup.c. It would be nice to refactor that.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2019-11-07 04:25:16 Re: Reorderbuffer crash during recovery
Previous Message Michael Paquier 2019-11-07 03:55:13 Re: Problem during Windows service start