Re: Issues with Quorum Commit

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issues with Quorum Commit
Date: 2010-10-13 21:44:00
Message-ID: AANLkTikccYZmZCBs6U912_r0XKYD16Ynx1F-k8VZRoBW@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 13, 2010 at 5:22 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Wed, Oct 13, 2010 at 3:50 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> There's another problem here we should think about, too.  Suppose you
>> have a master and two standbys.  The master dies.  You promote one of
>> the standbys, which turns out to be behind the other.  You then
>> repoint the other standby at the one you promoted.  Congratulations,
>> your database is now very possibly corrupt, and you may very well get
>> no warning of that fact.  It seems to me that we would be well-advised
>> to install some kind of bullet-proof safeguard against this kind of
>> problem, so that you will KNOW that the standby needs to be re-synced.
>
> Yep. This is why I said it's not easy to implement that.
>
> To start the standby without taking a base backup from new master after
> failover, the user basically has to promote the standby which is ahead
> of the other standbys (e.g., by comparing pg_last_xlog_replay_location
> on each standby).
>
> As the safeguard, we seem to need to compare the location at the switch
> of the timeline on the master with the last replay location on the standby.
> If the latter location is ahead AND the timeline ID of the standby is not
> the same as that of the master, we should emit warning and terminate the
> replication connection.
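
For concreteness, the check quoted above might look roughly like the
sketch below. It assumes locations are the "xlogid/xrecoff" hex strings
returned by pg_last_xlog_replay_location(); the function names and
arguments are hypothetical and are not part of PostgreSQL.

```python
from typing import Tuple


def parse_xlog_location(text: str) -> Tuple[int, int]:
    """Turn an 'xlogid/xrecoff' hex string into a tuple that compares in WAL order."""
    xlogid, xrecoff = text.split("/")
    return int(xlogid, 16), int(xrecoff, 16)


def standby_must_resync(master_tli: int, master_switch_location: str,
                        standby_tli: int, standby_replay_location: str) -> bool:
    """Hypothetical safeguard: refuse replication when the standby has replayed
    past the master's timeline switch point while on a different timeline."""
    standby_is_ahead = (parse_xlog_location(standby_replay_location)
                        > parse_xlog_location(master_switch_location))
    return standby_is_ahead and standby_tli != master_tli


# Standby on timeline 1 replayed past the point where the new master
# switched to timeline 2: warn and terminate the connection.
print(standby_must_resync(2, "0/3000000", 1, "0/3000158"))
# Same timeline on both sides: this check never fires, whatever the locations.
print(standby_must_resync(1, "0/3000000", 1, "0/3000158"))
```

Because the condition requires differing timeline IDs, this check by
construction says nothing when both nodes report the same timeline.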

That doesn't seem very bullet-proof. You can accidentally corrupt a
standby even when only one timeline is involved. AFAIK, stopping a
standby, removing recovery.conf, and starting it up again does not
change timelines. You can even shut down the standby, bring it up as
a master, generate a little WAL, shut it back down, and bring it back
up as a standby pointing to the same master. It would be nice to
embed in each checkpoint record an identifier that changes randomly on
each transition to normal running, so that if you do something like
this we can notice and complain loudly.
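
A toy model of that idea, as a sketch only: each node draws a fresh
random identifier whenever it enters normal running, and a standby may
follow a master only if the identifiers in its own history are a prefix
of the master's. The Node class, the prefix rule, and the 64-bit width
are assumptions made for illustration; nothing here reflects an
existing PostgreSQL structure.

```python
import secrets
from typing import List


class Node:
    """Toy stand-in for a server; run_ids models the identifiers that would
    be embedded in its checkpoint records, oldest first."""

    def __init__(self) -> None:
        self.run_ids: List[int] = []

    def enter_normal_running(self) -> None:
        # A new random identifier on every transition to normal running.
        self.run_ids.append(secrets.randbits(64))

    def clone_as_standby(self) -> "Node":
        # A standby built from a base backup shares the master's history so far.
        standby = Node()
        standby.run_ids = list(self.run_ids)
        return standby


def can_follow(standby: Node, master: Node) -> bool:
    """True only if everything the standby has seen also appears, in the same
    order, at the start of the master's history."""
    return master.run_ids[:len(standby.run_ids)] == standby.run_ids


master = Node()
master.enter_normal_running()
standby = master.clone_as_standby()
print(can_follow(standby, master))   # standby has only the master's history

# Standby is brought up as a master, generates a little WAL, and is then
# pointed back at the original master without any timeline change.
standby.enter_normal_running()
print(can_follow(standby, master))   # histories have diverged
```

The point of the random value is that the detour through normal running
leaves a mark even though the timeline ID is unchanged.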

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
