Re: Failback to old master

From: "Maeldron T(dot)" <maeldron(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failback to old master
Date: 2014-11-15 16:31:26
Message-ID: 54677FDE.4080701@gmail.com
Lists: pgsql-hackers

On 12/11/14 14:28, Ants Aasma wrote:
> On Tue, Nov 11, 2014 at 11:52 PM, Maeldron T. <maeldron(at)gmail(dot)com> wrote:
>> As far as I remember (I can’t test it right now but I am 99% sure) promoting the slave makes it impossible to connect the old master to the new one without making a base_backup. The reason is the timeline change. It complains.
> A safely shut down master (-m fast is safe) can be safely restarted as
> a slave to the newly promoted master. Fast shutdown shuts down all
> normal connections, does a shutdown checkpoint and then waits for this
> checkpoint to be replicated to all active streaming clients. Promoting
> slave to master creates a timeline switch, that prior to version 9.3
> was only possible to replicate using the archive mechanism. As of
> version 9.3 you don't need to configure archiving to follow timeline
> switches, just add a recovery.conf to the old master to start it up as
> a slave and it will fetch everything it needs from the new master.
>
I took your advice and I understood that removing the recovery.conf
followed by a restart is wrong. I will not do that on my production servers.

However, I can't make it work with promotion. What did I do wrong? This was
on 9.4beta3.

mkdir 1
mkdir 2
initdb -D 1/
<edit config: change port, set wal_level to hot_standby, hot_standby to on,
max_wal_senders=7, wal_keep_segments=100; uncomment the replication lines in pg_hba.conf>
pg_ctl -D 1/ start
createdb -p 5433
psql -p 5433
pg_basebackup -p 5433 -R -D 2/
mcedit 2/postgresql.conf <change port>
chmod -R 700 1
chmod -R 700 2
pg_ctl -D 2/ start
psql -p 5433
psql -p 5434
<everything works>
pg_ctl -D 1/ stop
pg_ctl -D 2/ promote
psql -p 5434
cp 2/recovery.done 1/recovery.conf
mcedit 1/recovery.conf <change port>
pg_ctl -D 1/ start

LOG: replication terminated by primary server
DETAIL: End of WAL reached on timeline 1 at 0/3000AE0.
LOG: restarted WAL streaming at 0/3000000 on timeline 1
LOG: replication terminated by primary server
DETAIL: End of WAL reached on timeline 1 at 0/3000AE0.

This is the same behaviour I saw in the past when I tried it with promote: the
old master keeps disconnecting from the new one. What am I missing?
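
For reference, the recovery.conf on the old master after the cp and port edit above would look roughly like the sketch below. It is not the actual file from this test: the exact primary_conninfo string written by pg_basebackup -R depends on how the backup connection was made, so the value shown is an assumption. The commented-out last line is also an assumption, not something the steps above set; per the PostgreSQL documentation, recovery_target_timeline defaults to the timeline that was current when the base backup was taken, and 'latest' asks recovery to follow the newest timeline it can find.

```
# 1/recovery.conf (sketch; values are assumptions, not copied from the test)
standby_mode = 'on'
# Port changed by hand so the old master connects to the promoted server.
primary_conninfo = 'port=5434'
# Not written by pg_basebackup -R; shown only as a setting worth checking.
# recovery_target_timeline = 'latest'
```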
