Re: [BUG] pg_basebackup from disconnected standby fails

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [BUG] pg_basebackup from disconnected standby fails
Date: 2016-10-24 07:39:00
Message-ID: CAB7nPqQZ8sNnZh-Shd23RP=-j1kLhjKHNpKCi72qkCi6XdqSEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 24, 2016 at 1:26 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Oct 21, 2016 at 10:32 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> 2. In perform_base_backup(), if the endptr returned by
>> do_pg_stop_backup() precedes the end of the checkpoint record returned
>> by do_pg_start_backup(), use the latter value instead. Unfortunately,
>> that's not so easy: we can't just say if (endptr < starptr) startptr =
>> endptr; because startptr is the *start* of the checkpoint record, not
>> the end. I suspect somebody could figure out a solution to this
>> problem, though.
>>
>
> With this approach, don't we need something similar for API's
> pg_stop_backup()/pg_stop_backup_v2()?

Yes, I think so. That would sort of map with the idea I mentioned
upthread to have pg_stop_backup() return the contents of the control
file and have the caller write it to the backup by itself.

>> If we decide we don't want to aim for one of these tighter solutions
>> and just adopt the previously-discussed patch, then at the very least
>> it needs better comments.
>
> +1.

Yeah, here is an attempt at doing that:
- * We return the current minimum recovery point as the backup end
+ * We return the current last replayed point as the backup end
* location. Note that it can be greater than the exact backup end
- * location if the minimum recovery point is updated after the backup of
+ * location if the last replayed point is updated after the backup of
* pg_control. This is harmless for current uses.
*
+ * Using the last replayed point as the backup end location ensures that
+ * the end location will never be older than the start position, something
+ * that could happen if for example minRecoveryPoint is used as backup
+ * end location when it never gets updated because no buffer flushes
+ * occurred. By using the last replay location, note that the backup may
+ * include more WAL segments than necessary. If the additional WAL
+ * replayed since minRecoveryPoint does not include a checkpoint, there
+ * is actually no need for it. Even if it does include a checkpoint,
+ * only the portion up to the checkpoint itself is necessary and not
+ * the WAL generated beyond that. Still, in the case of a backup taken
+ * from a standby, with its master disconnected, this ensures that the
+ * backup is valid.
+ *

Thoughts welcome.
--
Michael

Attachment Content-Type Size
backup-standby-v3.patch invalid/octet-stream 3.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-10-24 07:48:45 Re: Rename max_parallel_degree?
Previous Message Konstantin Knizhnik 2016-10-24 07:12:59 Re: On conflict update & hint bits