Re: Failback to old master

From: didier <did447(at)gmail(dot)com>
To: "Maeldron T(dot)" <maeldron(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failback to old master
Date: 2014-11-16 12:13:48
Message-ID: CAJRYxuJ7nBoQa3B_vW_8PwLx40VJn1tLKNPzMX6SrCKLXt=YpA@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Sat, Nov 15, 2014 at 5:31 PM, Maeldron T. <maeldron(at)gmail(dot)com> wrote:
>> A safely shut down master (-m fast is safe) can be safely restarted as
>> a slave to the newly promoted master. Fast shutdown shuts down all
>> normal connections, does a shutdown checkpoint and then waits for this
>> checkpoint to be replicated to all active streaming clients. Promoting
>> slave to master creates a timeline switch, that prior to version 9.3
>> was only possible to replicate using the archive mechanism. As of
>> version 9.3 you don't need to configure archiving to follow timeline
>> switches, just add a recovery.conf to the old master to start it up as
>> a slave and it will fetch everything it needs from the new master.
>>
> I took your advice and I understood that removing the recovery.conf followed
> by a restart is wrong. I will not do that on my production servers.
>
> However, I can't make it work with promotion. What did I do wrong? It was
> 9.4beta3.
>
> mkdir 1
> mkdir 2
> initdb -D 1/
> <edit config: change port, wal_level to hot_standby, hot_standby to on,
> max_wal_senders=7, wal_keep_segments=100, uncomment replication in pg_hba.conf>
> pg_ctl -D 1/ start
> createdb -p 5433
> psql -p 5433
> pg_basebackup -p 5433 -R -D 2/
> mcedit 2/postgresql.conf <change port>
> chmod -R 700 1
> chmod -R 700 2
> pg_ctl -D 2/ start
> psql -p 5433
> psql -p 5434
> <everything works>
> pg_ctl -D 1/ stop
> pg_ctl -D 2/ promote
> psql -p 5434
> cp 2/recovery.done 1/recovery.conf
> mcedit 1/recovery.conf <change port>
> pg_ctl -D 1/ start
>
> LOG: replication terminated by primary server
> DETAIL: End of WAL reached on timeline 1 at 0/3000AE0.
> LOG: restarted WAL streaming at 0/3000000 on timeline 1
> LOG: replication terminated by primary server
> DETAIL: End of WAL reached on timeline 1 at 0/3000AE0.
>
> This is what I experienced in the past when I tried with promote. The old
> master disconnects from the new. What am I missing?
>
I think you have to add
recovery_target_timeline = '2'
to recovery.conf, with '2' being the new primary's timeline.
See http://www.postgresql.org/docs/9.4/static/recovery-target-settings.html
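
For reference, a minimal sketch of what 1/recovery.conf on the old master
might look like after that change. The port in primary_conninfo is an
assumption taken from your steps (new master on 5434); the file that
pg_basebackup -R generated for you will likely carry more connection
parameters, so adjust rather than copy:

```
# 1/recovery.conf on the old master (sketch, not tested on your setup)
standby_mode = 'on'
# Assumption: the promoted server listens on port 5434, as in the quoted steps.
primary_conninfo = 'port=5434'
# Follow the timeline created by the promotion instead of staying on timeline 1.
recovery_target_timeline = '2'
```

Setting recovery_target_timeline = 'latest' instead of a fixed number should
also work, and saves you from editing the file again after a later promotion.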

Didier
