Re: Sync Rep Design

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-02 20:27:10
Message-ID: AANLkTikG=pxr2AOX9GBsO90g5DcHmbSj7E3GuxoU0JNn@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 1, 2011 at 8:35 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Sat, 2011-01-01 at 05:13 -0800, Jeff Janes wrote:
>> On 12/31/10, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
>> > 2. "sync" does not guarantee that the updates to the standbys are in any
>> > way coordinated. You can run a query on one standby and get one answer
>> > and at the exact same time run the same query on another standby and get
>> > a different answer (slightly ahead/behind). That also means that if the
>> > master crashes one of the servers will be ahead or behind. You can use
>> > pg_last_xlog_receive_location() to check which one that is.
>>
>> If at least one of the standbys is in the same smoking crater as the
>> primary, then pg_last_xlog_receive_location on it is unlikely to
>> respond.
>>
>> The guarantee goes away precisely when it is needed.
>
> Fairly obviously, I would not be advocating anything that forced you to
> use a server in the "same smoking crater".

You are forced to use the standby that is furthest ahead, otherwise
you might lose transactions that have already been reported as
committed.

The mere existence of a commit-releasing standby in the same data
center as the primary means that a remote standby is not very useful
for data preservation after campus-wide disasters. It is probably
behind (due to higher latency), and even if it is not behind, there is
no way to *know* that it is not behind if the on-site standby cannot
be contacted.
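
For concreteness: pg_last_xlog_receive_location() returns a text value
such as '2/4F8C1000', i.e. two 32-bit hex halves, so "furthest ahead"
means comparing those halves numerically, not as strings. A minimal
sketch in Python, with invented example values:

    # Minimal sketch: comparing two reported xlog receive locations.
    # The location text is "<high 32 bits>/<low 32 bits>" in hex,
    # so it must be compared numerically, not lexicographically.
    def lsn_to_int(lsn):
        hi, lo = lsn.split("/")
        return (int(hi, 16) << 32) | int(lo, 16)

    local_standby  = "2/4F8C1000"   # invented example values
    remote_standby = "2/4F8B0DA8"

    ahead = max((local_standby, remote_standby), key=lsn_to_int)
    print("furthest ahead: %s" % ahead)   # here, the local standby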

I understand that you are not advocating the use of one local standby
and one remote standby, both synchronous. But I think we need to
*explicitly* warn against it. After all, the docs do explicitly
recommend the use of two standbys. If we assume that the readers are
already experts, then they don't need that advice. If they are not
experts, then that advice could lead them to shoot themselves in the
foot, both kneecaps, and a femur (metaphorically speaking, unlike the
smoking crater, which is a literal scenario some people need to plan
for).

If durability is more important than availability, what would you
recommend? Only one synchronous standby, in a remote data center? Two
(or more) synchronous standbys, all in the same remote data center? Or
spread across two different remote data centers?

> I can't see any guarantee
> that goes away precisely when it is needed.

In order to know that you are not losing data, you have to be able to
contact every single semi-synchronous standby and invoke
pg_last_xlog_receive_location on it.
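
As a sketch of what that verification amounts to (Python with
psycopg2; the host names and connection details below are invented),
the moment any one standby fails to answer, you cannot tell whether
the answers you did get include the furthest-ahead copy:

    # Rough sketch only: poll every synchronous standby for its
    # receive location; hosts and connection parameters are invented.
    import psycopg2

    standbys = ["standby1.example.com", "standby2.example.com"]

    reported = {}
    for host in standbys:
        try:
            conn = psycopg2.connect(host=host, dbname="postgres",
                                    connect_timeout=5)
            cur = conn.cursor()
            cur.execute("SELECT pg_last_xlog_receive_location()")
            reported[host] = cur.fetchone()[0]
            conn.close()
        except psycopg2.OperationalError:
            reported[host] = None    # unreachable -- possibly in the crater

    if None in reported.values():
        print("Inconclusive: an unreachable standby may be the one "
              "furthest ahead. %r" % reported)
    else:
        # compare the hi/lo hex halves numerically, not as strings
        ahead = max(reported,
                    key=lambda h: tuple(int(p, 16)
                                        for p in reported[h].split("/")))
        print("Furthest-ahead standby: %s at %s" % (ahead, reported[ahead]))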

If your goal is to have data durability protected from major
catastrophes (and why else would you do synchronous rep to remote data
centers?), then it is expecting a lot to have every single standby
survive that major catastrophe. That expectation is an unavoidable
consequence of releasing the commit on the first confirmation. Perhaps
you think this consequence is too obvious to document; if so, I
disagree.

> Perhaps you could explain the issue you see, because your comments seem
> unrelated to my point above.

It is directly related to the part of your point about using
pg_last_xlog_receive_location. When planning for disaster recovery,
it is little comfort that you can do something in a non-disaster case,
if you can't also do it in likely disaster cases.

It probably wasn't relevant to the first part of your point, which I
must admit I did not understand.
Obviously they are coordinated in *some* way (I believe commits occur
in the same order on each server, for example). Different read-only
standbys could give different results, but only from among the
universe of results made possible by a given commit sequence. But
that is not the part I had intended to comment on, and I don't think
it is what other people concerned about durability after major
catastrophes were focusing on, either.

Cheers,

Jeff
