Re: Sync Rep Design

From: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-03 01:30:12
Lists: pgsql-hackers

On 2.1.2011 5:36, Robert Haas wrote:
> On Sat, Jan 1, 2011 at 6:54 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Yes, working out the math is a good idea. Things are much clearer if we
>> do that.
>> Let's assume we have 98% availability on any single server.
>> 1. Having one primary and 2 standbys, either of which can acknowledge,
>> and we never lock up if both standbys fail, then we will have 99.9992%
>> server availability. (So PostgreSQL hits "5 Nines", with data
>> guarantees). ("Maximised availability")
> I don't agree with this math.  If the master and one standby fail
> simultaneously, the other standby is useless, because it may or may
> not be caught up with the master.  You know that the last transaction
> acknowledged as committed by the master is on at least one of the two
> standbys, but you don't know which one, and so you can't safely
> promote the surviving standby.
> (If you are working in an environment where promoting the surviving
> standby when it's possibly not caught up is OK, then you don't need
> sync rep in the first place: you can just run async rep and get much
> better performance.)
> So the availability is 98% (you are up when the master is up) + 98%^2
> * 2% (you are up when both slaves are up and the master is down) =
> 99.92%.  If you had only a single standby, then you could be certain
> that any commit acknowledged by the master was on that standby.  Thus
> your availability would be 98% (up when master is up) + 98% * 2% (you
> are up when the master is down and the slave is up) = 99.96%.
OTOH, in the case where you need _all_ the slaves to confirm, any failing
slave brings the master down, so each slave you add lowers availability
by roughly another 2%.
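
A quick way to check that figure is the sketch below. It assumes
independent failures and the 98% per-server availability used above;
the function name is made up for illustration.

```python
def all_must_confirm(p: float, slaves: int) -> float:
    """Availability when the master and every sync slave must be up.

    Assumes failures are independent, so the probabilities multiply.
    """
    return p ** (1 + slaves)


p = 0.98  # per-server availability assumed in this thread
for n in range(4):
    # Each extra slave multiplies availability by p, i.e. costs about 2%.
    print(f"{n} slaves: {all_must_confirm(p, n):.4%}")
```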

The solution to achieving good durability AND availability is to require
N past the post instead of 1 past the post, i.e. to wait for ACKs from
N of the sync slaves rather than from just one.

In this case you can get to 99.9992% availability with master + 3 sync
slaves, 2 of which must ACK.
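
For anyone who wants to replay the arithmetic, here is a rough sketch.
It is one possible model, not the only one: it assumes independent
failures, 98% per-server availability, a master that never locks up
when slaves fail, and promotion after a master failure only when enough
slaves survive to be sure one of them holds the last acknowledged
commit. The function names are hypothetical. Under this model the
1-slave and 2-slave cases reproduce the 99.96% and 99.92% from the
quoted mail; the result for other configurations depends on these
assumptions and need not match figures derived under a different model.

```python
from math import comb


def at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent servers are up."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def availability(p: float, slaves: int, acks: int) -> float:
    """Availability with `slaves` sync slaves, `acks` of which must ACK.

    Up when the master is up, or when the master is down and at least
    (slaves - acks + 1) slaves survive, so that at least one survivor
    is guaranteed to have ACKed the last commit.
    """
    needed = slaves - acks + 1
    return p + (1 - p) * at_least(needed, slaves, p)


p = 0.98
print(f"1 slave,  1 ACK : {availability(p, 1, 1):.4%}")
print(f"2 slaves, 1 ACK : {availability(p, 2, 1):.4%}")
print(f"3 slaves, 2 ACKs: {availability(p, 3, 2):.4%}")
```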

Hannu Krosing
Performance and Infinite Scalability Consultant