Re: Sync Rep Design

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, greg(at)2ndquadrant(dot)com, Josh Berkus <josh(at)postgresql(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-04 19:04:37
Message-ID: AANLkTinUVPKCTUSZAiqAv5YsLmANXC_v=RHLo=KqeG_8@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jan 2, 2011 at 4:19 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Sun, 2011-01-02 at 18:54 +0200, Heikki Linnakangas wrote:
>
>> I believe we all agree that there's different use cases that require
>> different setups. Both "first-past-the-post" and "wait-for-all-to-ack"
>> have their uses.
>
> Robert's analysis is that "first-past-the-post" doesn't actually improve
> the durability guarantee (according to his calcs). Which means that
>  1 primary, 2 sync standbys with first-past-the-post
> is actually worse than
>  1 primary, 1 sync and 1 async standby
> in terms of its durability guarantees.
>
> So ISTM that Robert does not agree that both have their uses.

I think it depends on what failure modes you want to protect against.
If you have a primary in New York, a secondary in Los Angeles, and
another secondary in London, you might decide that the chances of two
standbys being taken out by the same event are negligible, or
alternatively that if one event does take out both of them, it'll be
something like a meteor where you'll have bigger things to worry about
than lost transactions. In that case, requiring one ACK but not two
is pretty sensible. If the primary goes down, you'll look at the two
remaining machines (which, by presumption, will still be up) and
promote whichever one is ahead. In this setup, you get a performance
benefit from waiting for either ACK rather than both ACKs, and you
haven't compromised any of the cases you care about.

However, if you have the traditional close/far setup, things are
different. Suppose you have a primary and a secondary in New York and
another secondary in Los Angeles. Now it has to be viewed as a
reasonable possibility that you could lose the New York site. If that
happens, you need to be able to promote the LA standby *without
reference to the NY standby*. So you really can't afford to do the
1-of-2 thing, because the only ACK for a given commit may have come
from the NY standby, and then when NY goes away you're not sure
whether the LA standby is safe to promote.

So, IMHO, it just depends on what you want to do.
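
To make the two scenarios concrete, here is a minimal sketch in Python.
It is only a toy model, not anything from the patch: the function names,
the "NY2" label for the New York standby, and the idea of enumerating
possible ACK sets are all assumptions made for illustration. It asks
whether, for every combination of standbys that could have ACKed a
commit, at least one surviving standby is guaranteed to have it.

```python
from itertools import combinations


def possible_ack_sets(standbys, required_acks):
    """Every set of standbys that might have ACKed a commit before the
    primary reported it as committed (at least `required_acks` of them)."""
    return [
        frozenset(combo)
        for k in range(required_acks, len(standbys) + 1)
        for combo in combinations(standbys, k)
    ]


def commit_survives(standbys, required_acks, lost_standbys):
    """True only if, whichever standbys happened to ACK, at least one
    surviving standby is guaranteed to hold every committed transaction."""
    survivors = frozenset(standbys) - frozenset(lost_standbys)
    return all(
        acked & survivors
        for acked in possible_ack_sets(standbys, required_acks)
    )


# Three sites (primary in NY, standbys in LA and London); only the
# primary is lost, so both standbys are still available to compare.
three_sites = commit_survives(["LA", "London"], required_acks=1, lost_standbys=[])

# Close/far (primary and standby "NY2" in NY, one standby in LA); the
# whole NY site is lost, taking the primary and NY2 with it.
close_far_one_ack = commit_survives(["NY2", "LA"], required_acks=1, lost_standbys=["NY2"])
close_far_two_acks = commit_survives(["NY2", "LA"], required_acks=2, lost_standbys=["NY2"])

print(three_sites, close_far_one_ack, close_far_two_acks)
```

Under these assumptions, one ACK is enough in the three-site case, but
in the close/far case only waiting for both ACKs lets you promote LA
without looking at NY.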

>> I'm not
>> sure what the point of such a timeout in general is, but people have
>> requested that.
>
> Again, this sounds like you think a timeout has no measurable benefit,
> other than to please some people's perceived needs.
>
>> The "wait-for-all-to-ack" looks a lot less ridiculous if you also
>> configure a timeout and don't wait for disconnected standbys
>
> Does it? Do Robert, Stefan and Aidan agree? What are the availability
> and durability percentages if we do that? Based on those, we may decide
> to do that instead. But I'd like to see some analysis of your ideas, not
> just a "we could". Since nobody has commented on my analysis, lets see
> someone else's.

Here's my take on this point. I think there is a use case for waiting
for a disconnected standby and a use case for not waiting for a
disconnected standby. The danger of NOT waiting for a disconnected
standby is that if you then go on to irretrievably lose the primary,
you lose transactions. But on the other hand, if you do wait, you've
made the primary unavailable. I don't know that there's one right
answer here. For some people, if they can't be certain of recording
the transaction in two places, then it may be better to not process
any transactions at all. For other people, it may be better to
process transactions unprotected for a while until a new standby is
up. It's not for us to make that judgment; we're here to
provide options.
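
As a rough illustration of that trade-off, here is a small Python
sketch. The policy names BLOCK and DEGRADE and the simulate() helper are
hypothetical and do not correspond to any existing or proposed setting;
the model also simplifies by treating a blocked commit as simply not
accepted during the outage, rather than queued until the standby
returns.

```python
from enum import Enum


class OnStandbyLoss(Enum):
    BLOCK = "block"      # wait for the disconnected standby
    DEGRADE = "degrade"  # keep committing without the standby


def simulate(policy, standby_up_per_commit):
    """Return (accepted, at_risk).

    accepted: commit attempts the primary acknowledged to the client.
    at_risk:  accepted commits that exist only on the primary, and so
              would be lost if the primary were irretrievably lost.
    """
    accepted, at_risk = [], []
    for txn_id, standby_up in enumerate(standby_up_per_commit, start=1):
        if standby_up:
            accepted.append(txn_id)
        elif policy is OnStandbyLoss.DEGRADE:
            accepted.append(txn_id)
            at_risk.append(txn_id)
        # Under BLOCK the commit is not acknowledged while the standby
        # is down: the primary is effectively unavailable for it.
    return accepted, at_risk


# Four commit attempts; the standby is disconnected for the middle two.
outage = [True, False, False, True]
blocked = simulate(OnStandbyLoss.BLOCK, outage)
degraded = simulate(OnStandbyLoss.DEGRADE, outage)
print(blocked, degraded)
```

In this toy run, BLOCK accepts fewer transactions but never leaves one
unprotected, while DEGRADE accepts all four and leaves the two middle
ones exposed to loss of the primary.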

Having said that, I am OK with whichever one we want to implement
first so long as we keep the door open to doing the other one later.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
