Re: How should pg_standby get over the gap of timeline?

From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How should pg_standby get over the gap of timeline?
Date: 2008-11-21 03:03:45
Message-ID: 3f0b79eb0811201903v1be9742dp48bf944db76de3df@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 21, 2008 at 12:06 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Fujii Masao wrote:
>>
>> Hi, Heikki. Thanks for the comment!
>>
>> On Thu, Nov 20, 2008 at 11:24 PM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>>
>>> Fujii Masao wrote:
>>>>
>>>> In the current Synch Rep patch, the standby cannot catch up with the
>>>> primary which has a bigger timeline.
>>>
>>> That would only happen if you've performed an archive recovery in the
>>> primary. If you've done PITR in the primary, I don't think there's any
>>> guarantee that it's even possible to catch up the standby. The standby
>>> might already have replayed a WAL file from an earlier timeline, that
>>> isn't part of the history of the bigger timeline.
>>
>> I assume the situation of making the standby (the original primary) catch
>> up with the primary (the original standby) after failover. Since a
>> timeline is incremented when a failover finishes archive recovery on a
>> standby, the timelines differ between two servers.
>
> That seems like a dangerous assumption. What if the standby had fallen
> behind before the failover? It's not safe to failover back to the original
> primary in that case. We'd need some kind of safeguards against that.

Yeah, it's a legitimate concern. As a safeguard, I'm going to delete the
XLOG files that may be inconsistent from the standby before making it
catch up. The XLOG file containing the recovery starting point and all
subsequent ones may be inconsistent, so they need to be copied from
the primary. I'm drafting this procedure on the wiki:
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects#Procedure
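
To make the selection rule concrete, here is a minimal sketch of how the
"recovery starting point and subsequent ones" set could be picked out of a
list of XLOG file names. It is only an illustration, not part of the patch:
the function names are hypothetical, and it assumes the usual 24-hex-digit
segment file names (8 digits of timeline, 8 of log id, 8 of segment number)
and that the name of the segment holding the recovery starting point is
already known.

```python
import re

# A WAL segment file name is 24 hex digits: timeline, log id, segment number.
WAL_NAME = re.compile(r"[0-9A-F]{24}")


def wal_position(name):
    """Return (log id, segment number) of a segment name, ignoring the timeline."""
    return int(name[8:16], 16), int(name[16:24], 16)


def possibly_inconsistent(filenames, start_segment):
    """List segments at or after the one holding the recovery starting point.

    Hypothetical helper: anything that is not a plain segment name
    (history files, backup labels, and so on) is skipped.
    """
    start = wal_position(start_segment)
    return sorted(
        name
        for name in filenames
        if WAL_NAME.fullmatch(name) and wal_position(name) >= start
    )
```

The files returned are the ones that would be deleted from the standby and
copied again from the primary; earlier segments are left alone.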

However, overwriting every XLOG file that may be inconsistent is overkill.
In the future, I'm going to provide a tool that compares the XLOG contents
of the two servers and tells the user which files should be overwritten.
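
As a rough idea of what such a comparison could look like (a sketch only,
not the planned tool), the following compares two directories file by file
using SHA-256 digests. The function names are hypothetical, and it assumes
both servers' XLOG directories are reachable as local paths, for example
through a copy or a network mount.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def files_to_overwrite(primary_dir, standby_dir):
    """List files in primary_dir that are missing or different in standby_dir."""
    standby = Path(standby_dir)
    differing = []
    for src in sorted(Path(primary_dir).iterdir()):
        if not src.is_file():
            continue
        dst = standby / src.name
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            differing.append(src.name)
    return differing
```

This compares whole files and does not parse WAL records, so it may flag a
file whose differences lie only in bytes that are not valid WAL. A real tool
would probably need to look at the record level.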

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
