Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Date: 2013-01-17 18:05:47
Message-ID: CAHGQGwEPfYMwbsqTXoKaB6KSm+Wg1ueToLj9cFnfs5Fz9=LZpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 18, 2013 at 2:35 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-01-17 13:47:41 +0900, Michael Paquier wrote:
>> Slave does not try anymore to reconnect to master with messages of the type:
>> FATAL: could not connect to the primary server
>>
>> I also noticed that there is some delay until modifications on master are
>> visible on slave.
>
>> I think that bug has been introduced by commit 7fcbf6a.
>> Before splitting xlog reading as a separate facility things worked
>> correctly.
>> There are also no delay problems before this commit.
>
> Heikki committed a fix for at least the promotion issue, I didn't notice
> any problem with an increased delay, could you check again if you still
> see it?

I encountered the problem that the timeline switch is not performed expectedly.
I set up one master, one standby and one cascade standby. All the servers
share the archive directory. restore_command is specified in the recovery.conf
in those two standbys.

I shut down the master, and then promoted the standby. In this case, the
cascade standby should switch to new timeline and replication should be
successfully restarted. But the timeline was never changed, and the following
log messages were kept outputting.

sby2 LOG: restarted WAL streaming at 0/3000000 on timeline 1
sby2 LOG: replication terminated by primary server
sby2 DETAIL: End of WAL reached on timeline 1
sby2 LOG: restarted WAL streaming at 0/3000000 on timeline 1
sby2 LOG: replication terminated by primary server
sby2 DETAIL: End of WAL reached on timeline 1
sby2 LOG: restarted WAL streaming at 0/3000000 on timeline 1
sby2 LOG: replication terminated by primary server
sby2 DETAIL: End of WAL reached on timeline 1
....

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-01-17 18:08:30 Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Previous Message Stephen Frost 2013-01-17 17:37:00 Re: could not create directory "...": File exists