Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject:	Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Date:	2013-01-17 04:47:41
Message-ID:	CAB7nPqTFmPCo1wR-AQEsL1cUQGR76WDdfOXbacwKSfH883p6fw@mail.gmail.com
Lists:	pgsql-hackers

Hi all,

There is a strange bug with the latest master head (commit 7fcbf6a).
When the WAL stream from the master is cut, the slave reports a FATAL error
(which is expected), but it then runs through recovery and promotes itself
automatically.
Here are the slave's logs from when this happened (I simply killed the master
manually):
FATAL: could not receive data from WAL stream:
cp: cannot stat ‘/home/michael/bin/pgsql/archive/master/000000010000000000000004’: No such file or directory
LOG: record with zero length at 0/401E1B8
LOG: redo done at 0/401E178
LOG: last completed transaction was at log time 2013-01-17 20:27:53.180971+09
cp: cannot stat ‘/home/michael/bin/pgsql/archive/master/00000002.history’: No such file or directory
LOG: selected new timeline ID: 2
cp: cannot stat ‘/home/michael/bin/pgsql/archive/master/00000001.history’: No such file or directory
LOG: archive recovery complete
DEBUG: resetting unlogged relations: cleanup 0 init 1
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
DEBUG: archived transaction log file "000000010000000000000004"
DEBUG: archived transaction log file "00000002.history"
LOG: statement: create table bn (a int);
DEBUG: autovacuum: processing database "postgres"

The slave no longer tries to reconnect to the master; the usual messages of
this type do not appear:
FATAL: could not connect to the primary server
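
For anyone trying to reproduce this, the symptom can be checked mechanically by scanning the slave's log. The following is only a rough sketch: the function name `find_unexpected_promotion` is made up for illustration, and it assumes the log messages are worded exactly as in the excerpt above. It reports the new timeline ID when a lost WAL stream is followed by a timeline switch and the end of archive recovery.

```python
import re

# Message patterns taken from the log excerpt in this report (assumed wording).
STREAM_LOST = re.compile(r"FATAL:\s+could not receive data from WAL stream")
NEW_TIMELINE = re.compile(r"LOG:\s+selected new timeline ID: (\d+)")
RECOVERY_DONE = re.compile(r"LOG:\s+archive recovery complete")


def find_unexpected_promotion(log_lines):
    """Return the new timeline ID if the log shows a promotion that follows
    a lost WAL stream, or None if no such sequence is found.

    Hypothetical helper for illustration; it is not part of PostgreSQL.
    """
    stream_lost = False
    new_timeline = None
    for line in log_lines:
        if STREAM_LOST.search(line):
            # Start (or restart) watching for a promotion after this point.
            stream_lost = True
            new_timeline = None
            continue
        if not stream_lost:
            continue
        match = NEW_TIMELINE.search(line)
        if match:
            new_timeline = int(match.group(1))
        elif RECOVERY_DONE.search(line) and new_timeline is not None:
            return new_timeline
    return None


if __name__ == "__main__":
    sample = [
        "FATAL: could not receive data from WAL stream:",
        "LOG: redo done at 0/401E178",
        "LOG: selected new timeline ID: 2",
        "LOG: archive recovery complete",
        "LOG: database system is ready to accept connections",
    ]
    print(find_unexpected_promotion(sample))
```

On a slave that behaves as expected, the log would instead contain repeated "could not connect to the primary server" messages, and the function would return None.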

I also noticed a delay before modifications made on the master become
visible on the slave.
For example run a simple CREATE TABLE and the new table is not

[some bisecting later...]

I think this bug was introduced by commit 7fcbf6a.
Before xlog reading was split out as a separate facility, things worked
correctly.
There were no delay problems before this commit either.

Has anyone else noticed this?
--
Michael Paquier
http://michael.otacoo.com

Responses

Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave at 2013-01-17 10:54:00 from Andres Freund
Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave at 2013-01-17 13:05:15 from Andres Freund
Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave at 2013-01-17 17:35:20 from Andres Freund
