Hot Standby Failover Scenario

From: Lucky Haryadi <xzorax89(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Hot Standby Failover Scenario
Date: 2012-02-28 03:05:52
Message-ID: CABGr5caenC-PZwZ=OtaTpXONKyazqUX_LAGX3A8xWd27X3nSFA@mail.gmail.com
Lists: pgsql-hackers

Hi everybody.

I would like to ask about a hot-standby issue. First, let me describe my
Postgres master-slave scenario.

1. Master A and Slave B are in different locations, say in different
regions of the country.
2. Master A and Slave B are configured for hot standby as described in the
documentation.
3. When Master A fails, the database is failed over to Slave B by creating
the trigger file.
4. As soon as Slave B becomes a standalone pg server, run pg_start_backup(),
so that all transactions are recorded only in WAL files.
5. Applications are switched to standalone B until the recovery of Server A
is done.
6. When Server A has recovered (but is still offline), run pg_stop_backup()
and copy all WAL files from B to A.
7. Once the WAL files are copied to A, set A's configuration back to master
and B's to slave again (for B, rename recovery.done to recovery.conf and
remove the trigger file).
8. Bring up A, restart B, and switch all applications back to A.
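
For reference, the file-level part of steps 3 and 7 on B can be sketched in
Python as below. This is only a minimal sketch: the function names and the
paths passed in are hypothetical, and it covers nothing beyond creating the
trigger file and renaming recovery.done back to recovery.conf. It says
nothing about whether the WAL copied from B can actually be replayed on A.

```python
from pathlib import Path


def promote_standby(trigger_file: Path) -> None:
    """Step 3: create the trigger file that the standby is watching for."""
    trigger_file.touch()


def prepare_standby_again(data_dir: Path, trigger_file: Path) -> None:
    """Step 7, B side only: rename recovery.done and drop the trigger file."""
    done = data_dir / "recovery.done"
    conf = data_dir / "recovery.conf"
    if not done.exists():
        raise FileNotFoundError(f"{done} not found; was this server promoted?")
    done.rename(conf)
    # The trigger file may already be gone, so a missing file is not an error.
    trigger_file.unlink(missing_ok=True)
```

The trigger file path has to be the same one named in B's recovery.conf,
and B still needs a restart afterwards, as in step 8.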

I've tried this procedure with no luck. Before A failed, A had 10 million
records and B had 10 million records too. Then I failed over to B manually,
simulating a failure of A. I ran pg_start_backup() and inserted a bunch of
data, so that, let's say, A still had 10 million records and B had 20
million. Then I copied the WAL files from B to A, hoping that when A came
up again it would catch up with B (A 20 million, B 20 million) and that
hot-standby streaming would resume. But my experiment failed.
I've checked the log and found that the timeline is invalid. Slave B's log
says that the timeline of the primary server (Master A) does not match the
target timeline of the standby server. Can anyone suggest how to handle
this case? Any suggestions will be greatly appreciated. Thank you.
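
Since the log points at timelines, it may help to check which timeline each
copied WAL file belongs to. A WAL segment file name is 24 hex digits, and
the first 8 of them are the timeline ID. A promoted standby normally starts
a new timeline, so files written by B after the failover would be expected
to carry a higher timeline ID than the ones A wrote. The sketch below is
only an illustration: timeline_of is a hypothetical helper, and the file
names in the usage lines are made-up examples, not taken from my servers.

```python
import re
from typing import Optional

# A WAL segment file name is 24 hex digits: timeline, log, segment (8 each).
_WAL_SEGMENT = re.compile(r"([0-9A-F]{8})[0-9A-F]{16}")


def timeline_of(wal_file_name: str) -> Optional[int]:
    """Return the timeline ID encoded in a WAL segment file name.

    Returns None for anything that is not a plain 24-digit segment name
    (for example timeline history files or backup label files).
    """
    match = _WAL_SEGMENT.fullmatch(wal_file_name)
    if match is None:
        return None
    return int(match.group(1), 16)


if __name__ == "__main__":
    # Made-up example names: one segment on timeline 1, one on timeline 2.
    for name in ("000000010000000000000005", "000000020000000000000006"):
        print(name, "-> timeline", timeline_of(name))
```

Running something like this over the names of the files copied from B to A
would show whether they are on a different timeline than A's own WAL.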
