Re: [BUG] pg_basebackup from disconnected standby fails

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] pg_basebackup from disconnected standby fails
Date: 2016-07-16 12:20:58
Message-ID: CAA4eK1+6aUv7OkhOJ2oEdSkBsWW8zj44Z__vzvu28a+0e6V8WQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 13, 2016 at 8:56 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Tue, Jul 12, 2016 at 12:22 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> I think updating minRecoveryPoint unconditionally can change it's
>> purpose in some cases. Refer below comments in code:
>>
>> * minRecoveryPoint is updated to the latest replayed LSN whenever we
>> * flush a data change during archive recovery. That guards against
>> * starting archive recovery, aborting it, and restarting with an earlier
>> * stop location. If we've already flushed data changes from WAL record X
>> * to disk, we mustn't start up until we reach X again.
>>
>> Now, as per above rule, the value of minRecoveryPoint can be much
>> smaller than XLogCtl->replayEndRecPtr. I think this can change the
>> rules when we can allow read-only connections.
>
> That would delay the moment read-only connections in hot standby are
> allowed in the case where a server is stopped with "immediate", then
> restarted with a different target if no data has been flushed when
> replaying.
>
>> I think your and Kyotaro-san's point that minRecoveryPoint should be
>> updated to support back-ups on standby has merits, but I think doing
>> it unconditionally might lead to change in behaviour in some cases.
>
> If we want to tackle the case I mentioned above, one way is to just
> update minRecoveryPoint when an exclusive or a non-exclusive backup is
> being taken by looking at their status in shared memory. See for
> example the patch (1) attached, but this does not save from the case
> where a node replays WAL, does not have data flushes, and from which a
> backup is taken, in the case where this node gets restarted later with
> the immediate mode and has different replay targets.
>

This looks clumsy as it updates minrecoverypoint for a specific
condition and it doesn't solve the above mentioned inconcistency.

> Another way that just popped into my mind is to add dedicated fields
> to XLogCtl that set the stop LSN of a backup the way it should be
> instead of using minRecoveryPoint. In short we'd update those fields
> in CreateRestartPoint and UpdateMinRecoveryPoint under
> XLogCtl->info_lck. The good thing is that this lock is already taken
> there. See patch (2) accomplishing that.
>

How is it different/preferable then directly using
XLogCtl->replayEndRecPtr and XLogCtl->replayEndTLI for stop backup
purpose? Do you see any problem if we go with what Kyotaro-san has
proposed in the initial patch [1] (aka using
XLogCtl->lastReplayedEndRecPtr and XLogCtl->lastReplayedTLI as stop
backup location)?

[1] - https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Seltenreich 2016-07-16 14:13:05 Re: GiST index build versus NaN coordinates
Previous Message Greg Stark 2016-07-16 12:19:58 Re: Regression tests vs existing users in an installation