Re: [BUG] pg_basebackup from disconnected standby fails

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] pg_basebackup from disconnected standby fails
Date: 2016-07-13 03:26:14
Message-ID: CAB7nPqR7GGvNc2nm34TFWhFj7cxNQGUFaVbmh=ZTU=DkoB1Z=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 12, 2016 at 12:22 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> I think updating minRecoveryPoint unconditionally can change it's
> purpose in some cases. Refer below comments in code:
>
> * minRecoveryPoint is updated to the latest replayed LSN whenever we
> * flush a data change during archive recovery. That guards against
> * starting archive recovery, aborting it, and restarting with an earlier
> * stop location. If we've already flushed data changes from WAL record X
> * to disk, we mustn't start up until we reach X again.
>
> Now, as per above rule, the value of minRecoveryPoint can be much
> smaller than XLogCtl->replayEndRecPtr. I think this can change the
> rules when we can allow read-only connections.

That would delay the moment read-only connections in hot standby are
allowed in the case where a server is stopped with "immediate", then
restarted with a different target if no data has been flushed when
replaying.

> I think your and Kyotaro-san's point that minRecoveryPoint should be
> updated to support back-ups on standby has merits, but I think doing
> it unconditionally might lead to change in behaviour in some cases.

If we want to tackle the case I mentioned above, one way is to just
update minRecoveryPoint when an exclusive or a non-exclusive backup is
being taken by looking at their status in shared memory. See for
example the patch (1) attached, but this does not save from the case
where a node replays WAL, does not have data flushes, and from which a
backup is taken, in the case where this node gets restarted later with
the immediate mode and has different replay targets.

Another way that just popped into my mind is to add dedicated fields
to XLogCtl that set the stop LSN of a backup the way it should be
instead of using minRecoveryPoint. In short we'd update those fields
in CreateRestartPoint and UpdateMinRecoveryPoint under
XLogCtl->info_lck. The good thing is that this lock is already taken
there. See patch (2) accomplishing that.

Thinking about that, patch (2) is far cleaner, and takes care of not
advancing minRecoveryPoint where not needed, but it does it for the
base backups as they should be. So the dependency between the minimum
recovery point and the end locations of a backup get reduced.
--
Michael

Attachment Content-Type Size
backup-standby-v5-1.patch text/x-patch 4.6 KB
backup-standby-v5-2.patch text/x-patch 5.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-07-13 03:29:49 RecoveryTargetTLI dead variable in XLogCtlData
Previous Message Masahiko Sawada 2016-07-13 01:55:04 Re: pg_basebackup wish list