Re: Why does replication need the old history file?

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why does replication need the old history file?
Date: 2015-06-12 11:44:11
Message-ID: CAHGQGwHLqALiiaVM1_oxt_c4yL7vHXe72FzpfC+_1Mr5rR3QFw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 12, 2015 at 5:18 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Fri, Jun 12, 2015 at 4:56 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Hackers,
>>
>> Sequence of events:
>>
>> 1. PITR backup of server on timeline 2.
>>
>> 2. Restored the backup to a new server, new-master.
>>
>> 3. Restored the backup to another new server, new-replica.
>>
>> 4. Started and promoted new-master (now on Timeline 3).
>>
>> 5. Started new-replica, connecting over streaming to new-master.
>>
>> 6. Get error message:
>>
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,1,,2015-06-11 12:24:14
>> PDT,,0,LOG,00000,"fetching timeline history file for timeline 2 from
>> primary server",,,,,,,,,""
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,2,,2015-06-11 12:24:14
>> PDT,,0,FATAL,XX000,"could not receive timeline history file from the
>> primary server: ERROR: could not open file
>> ""pg_xlog/00000002.history"": No such file or directory
>>
>> Questions:
>>
>> A. Why does the replica need 00000002.history? Shouldn't it only need
>> 00000003.history?
>
> From where is the base backup taken in case of the node started at 5?

The related source code comment explains why:

/*
 * Get any missing history files. We do this always, even when we're
 * not interested in that timeline, so that if we're promoted to
 * become the master later on, we don't select the same timeline that
 * was already used in the current master. This isn't bullet-proof -
 * you'll need some external software to manage your cluster if you
 * need to ensure that a unique timeline id is chosen in every case,
 * but let's avoid the confusion of timeline id collisions where we
 * can.
 */
WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
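
To illustrate the collision that comment is worried about, here is a minimal sketch of picking a new timeline ID by probing for existing history files. It is a simplified model, not the server's actual code: the function names and the in-memory array standing in for the history files in pg_xlog are assumptions made for illustration. The only detail taken from the log above is the file name shape (timeline ID as eight hex digits plus ".history").

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TimeLineID;

/* Build a history file name, e.g. timeline 2 -> "00000002.history". */
static void
history_file_name(char *buf, size_t len, TimeLineID tli)
{
    snprintf(buf, len, "%08X.history", (unsigned int) tli);
}

/*
 * Hypothetical stand-in for "does a history file for this timeline
 * exist locally?". Here the local files are modeled as an array of
 * timeline IDs instead of a directory lookup.
 */
static bool
history_exists(const TimeLineID *known, size_t nknown, TimeLineID tli)
{
    for (size_t i = 0; i < nknown; i++)
    {
        if (known[i] == tli)
            return true;
    }
    return false;
}

/*
 * Probe upward from the timeline we recovered on and return the first
 * timeline ID that has no history file. A node that never fetched the
 * master's history files cannot see that an ID is already taken.
 */
static TimeLineID
choose_new_timeline(const TimeLineID *known, size_t nknown,
                    TimeLineID start_tli)
{
    TimeLineID newest = start_tli;

    for (TimeLineID probe = start_tli + 1;; probe++)
    {
        if (!history_exists(known, nknown, probe))
            break;              /* first unused ID found */
        newest = probe;
    }
    return newest + 1;
}
```

In the scenario from this thread, a replica restored on timeline 2 that holds the history files for timelines 2 and 3 would pick timeline 4 on promotion. The same replica with no history files at all would pick timeline 3, which the promoted new-master is already using.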

>
>> B. Did something change in this regard in 9.3.6, 9.3.7 or 9.3.8? It was
>> working in our previous setup, on 9.3.5, although that could have just
>> been that the history file hadn't been removed from the backups yet.
>
> At a quick glance, I can see no relevant changes in xlog.c between those releases.

Yep, I could reproduce the "trouble" even in 9.3.5 on my laptop.

Regards,

--
Fujii Masao
