Re: Sync Rep Design

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, greg(at)2ndQuadrant(dot)com, Josh Berkus <josh(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-01 11:54:35
Message-ID: 1293882875.1892.56143.camel@ebony
Lists: pgsql-hackers

On Fri, 2010-12-31 at 22:18 +0100, Hannu Krosing wrote:
> On 31.12.2010 13:40, Heikki Linnakangas wrote:
> >
> > Sounds good.
> >
> > I still don't like the synchronous_standbys='' and
> > synchronous_replication=on combination, though. IMHO that still
> > amounts to letting the standby control the behavior on master, and it
> > makes it impossible to temporarily add an asynchronous standby to the mix.
> A sync standby _will_have_ the ability to control the master anyway by
> simply being there or not.
>
> What is currently proposed is having dual power lines / dual UPS' and
> working happily on when one of them fails.
> Requiring both of them to be present defeats the original purpose of
> doubling them.
>
> So following Simons design of 2 standbys and only one required to ACK to
> commit you get 2X reliability of single standby.
...

Yes, working out the math is a good idea. Things are much clearer if we
do that.

Let's assume we have 98% availability on any single server.

1. With one primary and 2 standbys, either of which can acknowledge,
and no lock-up if both standbys fail, we will have 99.9992%
server availability. (So PostgreSQL hits "5 Nines", with data
guarantees). ("Maximised availability")

2. With one primary and 2 standbys, either of which can acknowledge,
and a lock-up if both standbys fail in order to protect the data, we will
have 99.996% availability. Slightly less availability, but we don't put
data at risk at any time, since any commit is always covered by at least
2 servers. ("Maximised protection")

3. If we have a primary and a single standby which must acknowledge, and
we choose to lock up if the standby fails, then we will have only 96.04%
availability.

4. If we have a primary and two standbys (named or otherwise), both of
which must acknowledge or we lock up the master, then we have an awesome
94.12% availability.

On the last two, there is also an increased likelihood of administrative
cock-ups because of more specific and complex config requirements.
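
As a rough check on the arithmetic in the four scenarios above, here is a
minimal sketch that assumes each server is up independently with
probability 0.98. The function name and dictionary keys are illustrative
only. Under that assumption, scenarios 1, 3 and 4 come out as 1 - 0.02^3,
0.98^2 and 0.98^3. For scenario 2 the result depends on how failure of the
primary is modelled, so two possible readings are computed rather than one;
neither is claimed to be the model used above.

```python
def scenario_availability(a=0.98):
    """Availability per scenario, assuming independent server failures.

    `a` is the assumed availability of any single server.
    """
    f = 1.0 - a  # probability that a single server is down
    return {
        # Scenario 1: usable while at least one of the three servers is up.
        "one_of_three_up": 1.0 - f**3,
        # Scenario 2, reading A: usable while at least one of the two
        # standbys is up (the primary is not modelled here).
        "one_of_two_standbys_up": 1.0 - f**2,
        # Scenario 2, reading B: usable while at least two of the three
        # servers are up, so every commit is held by two servers.
        "two_of_three_up": a**3 + 3 * a**2 * f,
        # Scenario 3: primary and its single standby must both be up.
        "both_of_two_up": a**2,
        # Scenario 4: primary and both standbys must all be up.
        "all_three_up": a**3,
    }


if __name__ == "__main__":
    for name, value in scenario_availability().items():
        print(f"{name}: {value * 100:.4f}%")
```

Changing `a` shows how quickly the "all must acknowledge" configurations
lose availability compared with the "any one may acknowledge" ones.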

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
