Re: Promoting a standby during base backup (was Re: Switching timeline over streaming replication)

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Promoting a standby during base backup (was Re: Switching timeline over streaming replication)
Date: 2012-10-04 17:07:18
Message-ID: CAHGQGwFVENHT4i+Y3x94Xn8onLwVRTy_CV9zRd08D3hoMfnUoQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 4, 2012 at 4:59 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 03.10.2012 18:15, Amit Kapila wrote:
>>
>> On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:
>>>
>>> Hmm, should a base backup be aborted when the standby is promoted? Does
>>> the promotion render the backup corrupt?
>>
>>
>> I think currently it does so. Pls refer
>> 1.
>> do_pg_stop_backup(char *labelfile, bool waitforarchive)
>> {
>> ..
>>     if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>                  errmsg("the standby was promoted during online backup"),
>>                  errhint("This means that the backup being taken is corrupt "
>>                          "and should not be used. "
>>                          "Try taking another online backup.")));
>> ..
>>
>> }
>
>
> Okay. I think that check in do_pg_stop_backup() actually already ensures
> that you don't end up with a corrupt backup, even if the standby is promoted
> while a backup is being taken. Admittedly it would be nicer to abort it
> immediately rather than error out at the end.
>
> But I wonder why promoting a standby renders the backup invalid in the first
> place? Fujii, Simon, can you explain that?

Simon had the same question and I answered it before.

http://archives.postgresql.org/message-id/CAHGQGwFU04oO8YL5SUcdjVq3BRNi7WtfzTy9wA2kXtZNHicTeA@mail.gmail.com
---------------------------------------
> You say
> "If the standby is promoted to the master during online backup, the
> backup fails."
> but no explanation of why?
>
> I could work those things out, but I don't want to have to, plus we
> may disagree if I did.

If the backup succeeded in that case, an archive recovery started from that
backup would need to cross between two timelines. That means we would need to
set recovery_target_timeline before starting recovery. Whether
recovery_target_timeline needs to be set would then depend on whether the
standby was promoted while the backup was being taken. Leaving such a decision
to the user seems fragile.

pg_basebackup -x ensures that all required files are included in the backup, so
we can start recovery without restoring any file from the archive. But if the
standby is promoted during the backup, the timeline history file becomes
essential for recovery, and it is not included in the backup.
---------------------------------------
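
The two points quoted above can be sketched in C. This is only an
illustrative model, not PostgreSQL source: the BackupInfo struct, its fields,
and all three helper functions are hypothetical names invented here. It
assumes a backup can be summarized by the timeline it started on, the
timeline it stopped on, and whether the timeline history file made it into
the backup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of a finished base backup; not a PostgreSQL struct. */
typedef struct BackupInfo
{
    uint32_t start_tli;         /* timeline when the backup started */
    uint32_t stop_tli;          /* timeline when the backup stopped */
    bool     has_history_file;  /* timeline history file included? */
} BackupInfo;

/*
 * True if recovery from this backup would have to cross timelines,
 * i.e. the standby was promoted while the backup was being taken.
 */
static bool
backup_crosses_timelines(const BackupInfo *b)
{
    return b->start_tli != b->stop_tli;
}

/*
 * True if the user would have to set recovery_target_timeline before
 * starting recovery. In this model that is exactly the promoted case.
 */
static bool
needs_recovery_target_timeline(const BackupInfo *b)
{
    return backup_crosses_timelines(b);
}

/*
 * True if recovery can start from the backup alone, without restoring
 * anything from the archive. A backup that crosses timelines also needs
 * the timeline history file.
 */
static bool
backup_is_self_contained(const BackupInfo *b)
{
    return !backup_crosses_timelines(b) || b->has_history_file;
}
```

In this model, a backup that starts and stops on timeline 1 is
self-contained and needs no extra setting. A backup that starts on timeline 1
and stops on timeline 2 without the history file is neither, which is the
case the current check rejects.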

The situation may change once your switching-timeline patch is committed.
It would be useful if we could continue the backup even if the standby is
promoted.

Regards,

--
Fujii Masao
