Re: Issues with Quorum Commit

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issues with Quorum Commit
Date: 2010-10-08 20:04:05
Message-ID: 4CAF7935.7050001@2ndquadrant.com
Lists: pgsql-hackers

Markus Wanner wrote:
> ..and how do you make sure you are not marking your second standby as
> degraded just because it's currently lagging? Effectively degrading the
> utterly needed one, because your first standby has just bitten the dust?
>

People are going to monitor standby lag. If it gets excessive and starts
approaching the known timeout, the flashing yellow lights should go off
well before things get that bad. And if you've set a reasonable,
business-oriented timeout on how long you can stand for the master to be
held up waiting for a lagging standby, the right thing to do may very
well be to cut that standby off. At some point people will want to stop
waiting for a standby that's taking so long to commit that it's
interfering with the master's ability to operate normally. Such a master
is already degraded, if your performance metrics for availability
include processing transactions in a timely manner.
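
Just to sketch what I mean (purely illustrative; this assumes a
pg_stat_replication view and the WAL LSN functions as exposed in later
releases, not anything in today's tree), the kind of lag check a
monitoring system would run against the master looks roughly like this:

    -- Illustrative only: column and function names as in PostgreSQL 10+.
    -- Flag any connected standby whose unreplayed WAL exceeds a
    -- business-defined budget (16MB here), well before any commit timeout.
    SELECT application_name, client_addr,
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
      FROM pg_stat_replication
     WHERE pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) > 16 * 1024 * 1024;

When a check like that starts returning rows, that's the flashing yellow
light; if the lag keeps growing toward whatever timeout you've set,
dropping that standby out of the synchronous set is the sane response.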

> And how do you prevent the split brain situation in case the master dies
> shortly after these events, but fails to come up again immediately?
>

How is that a new problem? It's already possible to end up with a
standby pair that has suffered through some bizarre failure chain such
that it's not necessarily obvious which of the two systems has the most
recent set of data on it. And that's not this project's problem to
solve. Useful answers to the split-brain problem involve fencing
implementations that normally drop down to the hardware level, and
clustering solutions that include those features are already available
for PostgreSQL to integrate with. Assuming you have to solve this in
order to deliver a useful database replication component is excessively
ambitious.

You seem to be under the assumption that a more complicated replication
implementation here will make reaching a bad state impossible. I think
that's optimistic, both in theory and in regard to how successful code
gets built. Here's the thing: the difficulty of testing to prove your
code actually works is also proportional to that complexity. This
project can choose to commit and potentially ship a simple solution that
has known limitations, and expect that people will fill in the gap with
existing add-on software to handle the clustering parts it doesn't cover:
fencing, virtual IP address assignment, etc. All while getting useful
testing feedback on the simple bottom layer, whose main purpose in life
is to transport WAL data synchronously. Or, we can argue in favor of
adding additional complexity on top first instead, so we end up with
layers and layers of untested code. That path leads to situations where
you're lucky to ship at all, and when you do the result is difficult to
support.

--
Greg Smith, 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
