
Re: Sync Rep Design

From: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, greg(at)2ndquadrant(dot)com, Josh Berkus <josh(at)postgresql(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-02 10:59:32
Message-ID: 4D205A94.6070205@2ndquadrant.com
Lists: pgsql-hackers
On 2.1.2011 5:36, Robert Haas wrote:
> On Sat, Jan 1, 2011 at 6:54 AM, Simon Riggs<simon(at)2ndquadrant(dot)com>  wrote:
>> Yes, working out the math is a good idea. Things are much clearer if we
>> do that.
>>
>> Let's assume we have 98% availability on any single server.
>>
>> 1. Having one primary and 2 standbys, either of which can acknowledge,
>> and we never lock up if both standbys fail, then we will have 99.9992%
>> server availability. (So PostgreSQL hits "5 Nines", with data
>> guarantees). ("Maximised availability")
> I don't agree with this math.  If the master and one standby fail
> simultaneously, the other standby is useless, because it may or may
> not be caught up with the master.  You know that the last transaction
> acknowledged as committed by the master is on at least one of the two
> standbys, but you don't know which one, and so you can't safely
> promote the surviving standby.
> (If you are working in an environment where promoting the surviving
> standby when it's possibly not caught up is OK, then you don't need
> sync rep in the first place: you can just run async rep and get much
> better performance.)
> So the availability is 98% (you are up when the master is up) + 98%^2
> * 2% (you are up when both slaves are up and the master is down) =
> 99.92%.  If you had only a single standby, then you could be certain
> that any commit acknowledged by the master was on that standby.  Thus
> your availability would be 98% (up when master is up) + 98% * 2% (you
> are up when the master is down and the slave is up) = 99.96%.
>
OTOH, in the case where you need _all_ the slaves to confirm, any failing
slave brings the master down, so each added slave lowers availability by
roughly another 2%.
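
To make that concrete, here is a rough sketch of the arithmetic. It assumes
independent failures and the 98% per-server figure used earlier in the thread;
the function name is made up for illustration and is not anything in PostgreSQL.

```python
def all_must_ack_availability(p: float, n_standbys: int) -> float:
    """Availability when the master and every sync standby must be up.

    Assumes each server is up independently with probability p.
    """
    return p ** (1 + n_standbys)


# Each extra standby multiplies availability by another factor of p.
for n in range(4):
    print(f"{n} standbys: {all_must_ack_availability(0.98, n):.4%}")
```

With p = 0.98 each additional standby costs a little under two percentage
points.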

The solution to achieving good durability AND availability is to require
N past the post instead of 1 past the post, i.e. to wait for ACKs from N
slaves rather than from just the first one.

In this case you can get to 99.9992% availability with master + 3 sync
slaves, any 2 of which must ACK.
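
For anyone who wants to experiment with N-of-M setups, here is a small sketch
that generalises the arithmetic in the quoted text. It is only one possible
model and rests on assumptions that are mine, not agreed in this thread:
failures are independent, the master never locks up while it is alive, and
after the master dies you can promote safely only if enough slaves survive to
guarantee that one of them holds the last acknowledged commit. The function
names are hypothetical. Other failure models give other figures, so do not
read this as a derivation of the numbers quoted above.

```python
from math import comb


def at_least_up(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent servers are up."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def failover_availability(p: float, n_standbys: int, acks_required: int) -> float:
    """Up when the master is up, or when it is down and promotion is safe.

    A commit is on at least `acks_required` standbys, so at most
    n_standbys - acks_required of them can be missing it. One more
    survivor than that guarantees some survivor has the last commit.
    """
    survivors_needed = n_standbys - acks_required + 1
    return p + (1 - p) * at_least_up(survivors_needed, n_standbys, p)


# (standbys, ACKs required): the two quoted cases, then 2-of-3.
for n, k in [(1, 1), (2, 1), (3, 2)]:
    print(f"{n} standbys, {k} ACK(s): {failover_availability(0.98, n, k):.4%}")
```

The first two cases reproduce the 99.96% and 99.92% worked out in the quoted
text; changing p, the number of slaves or the ACK count shows how sensitive
the result is to the model.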

---------------------------------------
Hannu Krosing
Performance and Infinite Scalability Consultant
http://www.2ndQuadrant.com/books/



