9.0 replication -- multiple hot_standby servers

From: "Dean Gibson (DB Administrator)" <postgresql(at)ultimeth(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: 9.0 replication -- multiple hot_standby servers
Date: 2010-10-29 21:32:15
Message-ID: 4CCB3D5F.4090102@ultimeth.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Previously posted in [ADMIN]:

Two days ago I upgraded five DB boxes (for load balancing) from 8.3.0 to
9.0.1 in order to use replication. The replication configuration went
reasonably well, and now all the four "hot_standby" servers are
(streaming) replicating just fine from the primary DB server. If the
primary fails and I "touch" the trigger file on one of the standby
boxes, that goes into primary mode just as it should. Of course, I have
to externally redirect updates to the new server.

*Question #1*: How do I configure the other three (still) hot_standby
boxes to now use the new primary ??? Clearly I can change the
"recovery.conf" file on each standby box, but that seems like an
unnecessary nuisance.

What I would like to do is put multiple "primary_conninfo" lines into
the "recovery.conf" file, and have each standby server dynamically pick
the one that is the primary. Further, when the primary goes down, the
standby should re-choose the new primary using the multiple
"primary_conninfo" lines. Does that work ???

Such a feature (if it does not already exist) would really be useful !!!

*Question #2*: While the documentation described how to switch a
standby server from hot_standby to primary, I didn't see how to switch
it back to hot_standby. Is the following the best (or only) method ???

1. Optionally, bring the previous primary back up into hot_standby mode.
2. STOP the current primary server.
3. Using the trigger file on another hot-standby box, bring it into
primary mode.
4. Externally, start redirecting updates to the new primary server.
5. On the stopped server, delete the trigger file, and rename the
"recovery.done" file back to "recovery.conf".
6. Restart the stopped server; it should come back up into
hot_standby mode.

Ideally, this could be done seamlessly without the (small) possibility
of lost updates, but I don't see how to do that. Fortunately, it's not
a big issue in my case.

*Opinion:* Here's what I think would be *ideal* from an update
integrity standpoint:

1. If a hot_standby box gets a non-read-only SQL transaction, if
checks to see if it is still talking to a primary server:
* If it is, it does whatever is done now (I presume it reports
an error).
* If not, it "touches" the trigger file and switches into
primary mode. If there are other hot_standby servers,
ideally the new primary signals them that it is the new
primary (in order to keep them from accessing the old
primary, which may still be visible to them).
2. This way, an external application feeding SQL update-type
transactions, upon finding that the old primary is down, just
switches to any hot_standby server as its new target and retries
the update; this will automatically cause the desired hot_standby
server to switch to primary.
3. Since we don't know what the issue was with the the previous
primary, it needs to be recovered manually (and isolated from
other hot_standby servers in the meantime).

Sincerely, Dean

--
Mail to my list address MUST be sent via the mailing list.
All other mail to my list address will bounce.

Browse pgsql-admin by date

  From Date Subject
Next Message Dimitri Fontaine 2010-10-31 18:45:53 Re: large database: problems with pg_dump and pg_restore
Previous Message Kevin Grittner 2010-10-29 20:13:33 Re: Logfile