Re: Issues with Quorum Commit

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issues with Quorum Commit
Date: 2010-10-13 04:43:57
Message-ID: AANLkTikb3xu9pQwHrm6gxjcrZXbpxK5PhZWPnyZj6yHE@mail.gmail.com
Lists: pgsql-hackers

On Sat, Oct 9, 2010 at 12:12 AM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
> On 10/08/2010 04:48 PM, Fujii Masao wrote:
>> I believe many systems require write-availability.
>
> Sure. Make sure you have enough standbies to fail over to.

Unfortunately, even having enough standbys doesn't increase
write-availability unless you choose wait-forever. This is because,
after promoting one of the standbys to be the new master, you must keep
all transactions waiting until at least one standby has connected to
and caught up with the new master. Currently this wait time is not short.
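
To make that wait concrete, here is a minimal sketch of the commit rule
being discussed. It is a toy model, not PostgreSQL code: the Master
class, its fields and can_acknowledge() are hypothetical names, and LSNs
are reduced to plain integers.

```python
# Toy model of quorum commit; not PostgreSQL code. All names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Master:
    quorum_commit: int                      # standby acks required per commit
    flushed_lsn: int = 0                    # how far the master has written WAL
    standby_lsn: dict = field(default_factory=dict)  # standby name -> acked LSN

    def write(self) -> int:
        """Write one transaction's WAL locally and return its commit LSN."""
        self.flushed_lsn += 1
        return self.flushed_lsn

    def can_acknowledge(self, commit_lsn: int) -> bool:
        """True once enough standbys have acked WAL up to commit_lsn."""
        acks = sum(1 for lsn in self.standby_lsn.values() if lsn >= commit_lsn)
        return acks >= self.quorum_commit


# Right after promotion the new master has no connected standby.
new_master = Master(quorum_commit=1)
lsn = new_master.write()
blocked = not new_master.can_acknowledge(lsn)        # commit must wait

# A standby connects but is still behind: the commit keeps waiting.
new_master.standby_lsn["standby1"] = 0
still_blocked = not new_master.can_acknowledge(lsn)

# Only once the standby has caught up can the commit be acknowledged.
new_master.standby_lsn["standby1"] = lsn
released = new_master.can_acknowledge(lsn)
```

In this model the commit stays blocked for the whole time it takes a
standby to reconnect and catch up, which is the window referred to above.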

> (I think there are even more situations where read-availability is much
> more important, though).

Even so, we should not ignore the write-availability aspect.

>>> Start with 0 (i.e. replication off), then add standbies, then increase
>>> quorum_commit to your new requirements.
>>
>> No. This only makes the procedure of failover more complex.
>
> Huh? This doesn't affect fail-over at all. Quite the opposite, the
> guarantees and requirements remain the same even after a fail-over.

Hmm... that increases the number of steps that users must perform at
failover. At the least, users would have to wait until the standby has
caught up with the new master, increase quorum_commit, and then reload
the configuration file.
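
The extra steps can be written out as a small walk-through. This is
only an illustrative sketch under assumptions: failover_steps() and its
parameters are hypothetical, and quorum_commit is the setting proposed
in this thread rather than an existing one.

```python
# Toy walk-through of the failover steps; not PostgreSQL code.
# Every name here is hypothetical.

def failover_steps(master_lsn: int, standby_lsns: list, target_quorum: int) -> list:
    """Return the manual steps an operator would perform, in order."""
    steps = ["promote standby to new master with quorum_commit = 0"]
    caught_up = sum(1 for lsn in standby_lsns if lsn >= master_lsn)
    if caught_up < target_quorum:
        # Raising the quorum now would block every commit, so the operator waits.
        steps.append(f"wait: {caught_up} of {target_quorum} required standbys caught up")
        return steps
    steps.append(f"set quorum_commit = {target_quorum} in the configuration file")
    steps.append("reload the configuration file")
    return steps


# Standby still behind the new master: the operator can only wait.
pending = failover_steps(master_lsn=100, standby_lsns=[90], target_quorum=1)

# Standby caught up: the quorum can be raised and the configuration reloaded.
done = failover_steps(master_lsn=100, standby_lsns=[100], target_quorum=1)
```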

>> What is a full-cluster crash?
>
> The event that all of your cluster nodes are down (most probably due to
> power failure, but fires or other catastrophic events can be other
> causes). Chances for that to happen can certainly be reduced by
> distributing to distant locations, but that equally certainly increases
> latency, which isn't always an option.

Yep.

>> Why does it cause a split-brain?
>
> First master node A fails, a standby B takes over, but then fails as
> well. Let node C take over. Then the power aggregates catches fire, the
> infamous full-cluster crash (where "lights out management" gets a
> completely new meaning ;-) ).
>
> Split brain would be the situation that arises if all three nodes (A, B
> and C) start up again and think they have been the former master, so
> they can now continue to apply new transactions. Their data diverges,
> leading to what could be seen as a split-brain from the outside.
>
> Obviously, you must disallow A and B to take the role of the master
> after recovery.

Yep. Something like STONITH would be required.
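
For illustration, the rule "A and B must not become master again" can
be sketched with a failover counter. Note that this is a different and
much simpler technique than STONITH, which fences a node externally.
The generation numbers and the allowed_master() helper are assumptions
made up for this example.

```python
# Toy model only; not STONITH and not PostgreSQL code. All names are hypothetical.

def allowed_master(last_master_generation: dict) -> str:
    """Pick the only node that may resume as master after a full-cluster restart.

    last_master_generation maps node name -> the generation (a counter bumped
    on every failover) in which that node last acted as master.
    """
    return max(last_master_generation, key=last_master_generation.get)


# A was master first, B took over, then C: C holds the newest generation.
generations = {"A": 1, "B": 2, "C": 3}
winner = allowed_master(generations)

# Every other node has to come back as a standby of the winner.
must_rejoin_as_standby = sorted(n for n in generations if n != winner)
```

Such a check only works if every node can learn the others' generations
before accepting writes; a node that starts up in isolation still needs
real fencing.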

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
