Re: Online base backup from the hot-standby

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Jun Ishiduka <ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, ssinger_pg(at)sympatico(dot)ca, cedric(dot)villemain(dot)debian(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, heikki(dot)linnakangas(at)enterprisedb(dot)com
Subject: Re: Online base backup from the hot-standby
Date: 2011-09-22 15:44:44
Message-ID: CABUevEw6-9meRCx1HbJKKACuMLt3y=c8PMe+=hiJp4g2Fyo=Dg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 22, 2011 at 14:13, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Wed, Sep 21, 2011 at 5:34 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> On Wed, Sep 21, 2011 at 08:23, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>> Presumably pg_start_backup() will check this. And we'll somehow track
>>>> this before pg_stop_backup() as well? (for evil things such as
>>>> the user changing FPW from on to off and then back to on again during
>>>> a backup, which will make it look correct both during start and stop,
>>>> but incorrect in the middle - pg_stop_backup needs to fail in that
>>>> case as well)
>>>
>>> Right. As I suggested upthread, to address that problem, we need to log
>>> the change of FPW on the master, and then we need to check whether
>>> such a WAL record is replayed on the standby during the backup. If it is,
>>> pg_stop_backup() should emit an error.
>>
>> I somehow missed this thread completely, so I didn't catch your
>> previous comments - oops, sorry. The important point is that we
>> need to track whether this happens, even if it has since been reset to a
>> valid value. So we can't just check the state of the variable at the
>> beginning and at the end.
>
> Right. Let me explain again what I'm thinking.
>
> When FPW is changed, the master always writes a WAL record
> that contains the current value of FPW. This means that the standby
> can track all changes of FPW by reading WAL records.
>
> The standby has two flags: one indicates whether FPW has always
> been TRUE since the last restartpoint. The other indicates whether FPW
> has always been TRUE since the last pg_start_backup(). The standby
> can maintain those flags by reading WAL records streamed from
> the master.
>
> If the former flag is FALSE (i.e., the WAL records which
> the standby has replayed since the last restartpoint might not contain
> the required full-page writes), pg_start_backup() fails. If the latter
> flag is FALSE (i.e., the WAL records which the standby has replayed
> during the backup might not contain the required full-page writes),
> pg_stop_backup() fails.
>
> If I'm not missing something, this approach can address the problem
> which you're concerned about.

Yeah, it sounds safe to me.
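
To make the two-flag scheme quoted above concrete, here is a rough toy
model of it. This is a sketch only, not PostgreSQL code: the class, its
method and attribute names, and the rule that a restartpoint resets the
first flag to the current FPW value are all assumptions made for
illustration.

```python
class BackupError(Exception):
    """Raised when the replayed WAL might lack the required full-page writes."""


class StandbyFpwTracker:
    """Toy model of the two flags kept on the standby (all names hypothetical)."""

    def __init__(self, fpw=True):
        self.fpw = fpw                        # last FPW value seen in replayed WAL
        self.fpw_on_since_restartpoint = fpw  # flag 1: checked by start_backup()
        self.fpw_on_since_start_backup = fpw  # flag 2: checked by stop_backup()

    def replay_fpw_change(self, new_value):
        """Replay a WAL record carrying the master's new FPW setting."""
        self.fpw = new_value
        if not new_value:
            # Once FPW has been off, both flags stay FALSE until they are
            # explicitly reset, even if FPW is later switched back on.
            self.fpw_on_since_restartpoint = False
            self.fpw_on_since_start_backup = False

    def restartpoint(self):
        # Assumption: a new restartpoint restarts tracking from the current value.
        self.fpw_on_since_restartpoint = self.fpw

    def start_backup(self):
        if not self.fpw_on_since_restartpoint:
            raise BackupError("FPW was off at some point since the last restartpoint")
        # Begin tracking for this backup.
        self.fpw_on_since_start_backup = True

    def stop_backup(self):
        if not self.fpw_on_since_start_backup:
            raise BackupError("FPW was off at some point during the backup")
```

In this model, replaying an off record followed by an on record between
start_backup() and stop_backup() leaves the second flag FALSE, so
stop_backup() raises even though FPW is on at both ends of the backup.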

Would it make sense for pg_start_backup() to have the ability to wait
for the next restartpoint in a case like this, instead of failing, if
we know that FPW has been set? Or maybe that's just overcomplicating
things when trying to be user-friendly.
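
Roughly, the "wait instead of fail" idea could look like the sketch
below. It is only an illustration of the question, not a proposal for
the actual implementation: the function, its parameters, and the
wait_for_restartpoint callback (assumed to block until the next
restartpoint completes and to return whether FPW was on at that point)
are all hypothetical.

```python
class BackupError(Exception):
    """Raised when the backup cannot safely be started."""


def start_backup_or_wait(fpw_now, fpw_on_since_restartpoint,
                         wait_for_restartpoint=None):
    """Decide whether a backup on the standby may start (hypothetical helper).

    fpw_now: the FPW value most recently seen in replayed WAL.
    fpw_on_since_restartpoint: the first flag from the scheme quoted above.
    wait_for_restartpoint: optional callback; assumed to block until the next
        restartpoint and return True if FPW was on when it completed.
    """
    if fpw_on_since_restartpoint:
        return "started"
    # Waiting can only help if FPW is currently on; otherwise the next
    # restartpoint would still start from WAL without full-page writes.
    if fpw_now and wait_for_restartpoint is not None:
        if wait_for_restartpoint():
            return "started after waiting"
    raise BackupError("FPW was off at some point since the last restartpoint")
```

Without a callback this behaves like the plain "fail" variant, so the
waiting behaviour stays optional.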

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
