Re: Sync Rep Design

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-01 14:03:35
Message-ID: 4D1F3437.10503@kaltenbrunner.cc
Lists: pgsql-hackers

On 01/01/2011 02:13 PM, Jeff Janes wrote:
> On 12/31/10, Simon Riggs<simon(at)2ndquadrant(dot)com> wrote:
>> On Fri, 2010-12-31 at 09:27 +0100, Stefan Kaltenbrunner wrote:
>>
>>> Maybe it has been discussed but I still don't see why it makes any
>>> sense. If I declare a standby a sync standby I better want it sync - not
>>> "maybe sync". Consider the case of 1 master and two identical sync
>>> standbys - one sync standby is in the same datacenter, the other is in a
>>> backup location, say 15km away.
>>> Given there is a small constant latency to the second box (even if you
>>> have fast networks) the end effect is that the second standby will NEVER
>>> be sync (because the local one will always be faster) and you end up
>>> with an async slave that cannot be used per your business rules?
>>
>> Your picture above is a common misconception. I will add something to
>> the docs to explain this.
>>
>> 1. "sync" is a guarantee about how we respond to the client when we
>> commit. If we wait for more than one response that slows things down,
>> makes the cluster more fragile, complicates the code and doesn't
>> appreciably improve the guarantee.
>
> Whether it is more fragile depends on if you look at up-time fragility
> or durability fragility. I think it can appreciably improve the
> guarantee.
>
>>
>> 2. "sync" does not guarantee that the updates to the standbys are in any
>> way coordinated. You can run a query on one standby and get one answer
>> and at the exact same time run the same query on another standby and get
>> a different answer (slightly ahead/behind). That also means that if the
>> master crashes one of the servers will be ahead or behind. You can use
>> pg_last_xlog_receive_location() to check which one that is.
>
> If at least one of the standbys is in the same smoking crater as the
> primary, then pg_last_xlog_receive_location on it is unlikely to
> respond.
>
> The guarantee goes away precisely when it is needed.

That is exactly my point: if you have no guarantee that your SYNC
standby is actually in sync, it is of no use in business cases that
require sync replication.
If we cannot support that use case, I would like to see us either
restrict this to a single sync-capable standby or put a big CAVEAT into
the docs saying that sync replication in pg is only a hint, not a
guarantee, and that it might or might not be honored when there is more
than one standby.
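
To make the concern concrete, here is a minimal sketch of the two
acknowledgement policies under discussion. It is a toy model, not
PostgreSQL code: the function name commit_ack, the policy names "first"
and "all", the standby names and the latency figures are all made up for
illustration.

```python
def commit_ack(latencies_ms, policy):
    """Toy model: when can the master answer the client, and which
    standbys are known to have the commit at that moment?

    latencies_ms maps a (hypothetical) standby name to the time in ms
    until its confirmation reaches the master.
    policy is "first" (return on the first confirmation) or
    "all" (wait for every sync standby).
    """
    if not latencies_ms:
        raise ValueError("need at least one standby")
    ordered = sorted(latencies_ms.values())
    if policy == "first":
        ack_time = ordered[0]
    elif policy == "all":
        ack_time = ordered[-1]
    else:
        raise ValueError("unknown policy: %r" % (policy,))
    # Standbys whose confirmation has arrived by the time we answer.
    confirmed = sorted(
        name for name, lat in latencies_ms.items() if lat <= ack_time
    )
    return ack_time, confirmed


# Assumed numbers: local standby in the same datacenter, remote one ~15km away.
latencies = {"local": 0.5, "remote": 2.0}

print(commit_ack(latencies, "first"))  # (0.5, ['local'])
print(commit_ack(latencies, "all"))    # (2.0, ['local', 'remote'])
```

With the "first" policy and a constant latency gap, the remote standby
is never among the confirmed ones at commit time, so losing the whole
local datacenter can lose acknowledged commits. The "all" policy closes
that gap at the cost of a slower commit.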

Stefan
