Re: [BUG] pg_basebackup from disconnected standby fails

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] pg_basebackup from disconnected standby fails
Date: 2016-07-19 05:13:36
Message-ID: CAB7nPqRXMkR2rq7_Wa1aZZqBFVRUwU_5QTBr_kEQqWGUEjAaAQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 16, 2016 at 9:20 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Jul 13, 2016 at 8:56 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> If we want to tackle the case I mentioned above, one way is to just
>> update minRecoveryPoint when an exclusive or a non-exclusive backup is
>> being taken by looking at their status in shared memory. See for
>> example the patch (1) attached, but this does not save from the case
>> where a node replays WAL, does not have data flushes, and from which a
>> backup is taken, in the case where this node gets restarted later with
>> the immediate mode and has different replay targets.
>
> This looks clumsy as it updates minRecoveryPoint for a specific
> condition and it doesn't solve the above-mentioned inconsistency.

Yep. I am not saying the contrary. That's why (2) with its separate
fields would make more sense.

>> Another way that just popped into my mind is to add dedicated fields
>> to XLogCtl that set the stop LSN of a backup the way it should be
>> instead of using minRecoveryPoint. In short we'd update those fields
>> in CreateRestartPoint and UpdateMinRecoveryPoint under
>> XLogCtl->info_lck. The good thing is that this lock is already taken
>> there. See patch (2) accomplishing that.
>
> How is it different/preferable than directly using
> XLogCtl->replayEndRecPtr and XLogCtl->replayEndTLI for stop backup
> purpose? Do you see any problem if we go with what Kyotaro-san has
> proposed in the initial patch [1] (aka using
> XLogCtl->lastReplayedEndRecPtr and XLogCtl->lastReplayedTLI as stop
> backup location)?

Re-reading this thread from scratch and scratching my head, I am
actually not getting why we bumped into the topic of making
minRecoveryPoint updates more aggressive instead of going with the
first proposal :)

Knowing that we have no way to be sure whether pg_control has been
backed up last or not, using the last replayed LSN and TLI would be the
simplest solution, so let's do this for back-branches. It is an
annoyance not to be able to guarantee that backups can be taken while
the master is stopped or when there is no activity that updates
relation pages.

The thing that is really annoying btw is that there will always be a
gap between minRecoveryPoint and the actual moment where a backup
finishes because there is no way to rely on the XLOG_BACKUP_END
record. On top of that we cannot be sure whether pg_control has been
backed up last or not, which is why it would be cool to document that
gap. Another crazy idea would be to return pg_control as an extra
return field of pg_stop_backup() and encourage users to write that
back into the backup itself. This would allow closing any hole in the
current logic for backups taken from live standbys: minRecoveryPoint
would be updated directly to the last replayed LSN/TLI in the control
file.
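
As a rough illustration of that last idea, the sketch below builds the
copy of the control data that would be handed back to the user. The
names are hypothetical (SketchControlData and
sketch_control_for_backup are not PostgreSQL APIs), and a real control
file carries many more fields plus a CRC that would have to be
recomputed.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced model of the control file contents. */
typedef struct SketchControlData
{
    uint64_t minRecoveryPoint;
    uint32_t minRecoveryPointTLI;
} SketchControlData;

/*
 * Return a copy of the on-disk control data whose minRecoveryPoint is
 * set directly to the last replayed LSN/TLI. The on-disk data itself
 * is left untouched; only the copy meant for the backup changes.
 */
SketchControlData
sketch_control_for_backup(SketchControlData on_disk,
                          uint64_t last_replayed_lsn,
                          uint32_t last_replayed_tli)
{
    SketchControlData copy = on_disk;

    copy.minRecoveryPoint = last_replayed_lsn;
    copy.minRecoveryPointTLI = last_replayed_tli;
    return copy;
}
```

The user would then write the returned copy into the backup in place
of whatever pg_control was copied earlier.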
--
Michael
