Re: Sync Rep Design

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep Design
Date: 2011-01-01 17:29:20
Message-ID: 1293902960.1892.61053.camel@ebony
Lists: pgsql-hackers

On Sat, 2011-01-01 at 18:13 +0100, Stefan Kaltenbrunner wrote:
> On 01/01/2011 05:55 PM, Simon Riggs wrote:
> >
> > It appears to me there has been substantial confusion over alternatives,
> > because of a misunderstanding about how synchronisation works. Requiring
> > confirmation that standbys are in sync is *not* the same thing as them
> > actually being in sync. Every single proposal made by anybody here on
> > hackers that supports multiple standby servers suffers from the same
> > issue: when the primary crashes you need to work out which standby
> > server is ahead.
>
> aaah that was exactly what I was after - so the problem is that when you
> have a sync standby it will technically always be "in front" of the
> master (because it needs to fsync/apply/whatever before the master).
> In the end the question boils down to what is "the bigger problem" in
> the case of a lost master:

> a) a transaction that was confirmed on the master but might not be on
> any of the surviving sync standbys (or you will never know if it is) -
> this is how I understand the proposal so far

No, that cannot happen. The current situation is that we fsync WAL
on the master, then fsync WAL on the standby, then reply to the master.
The standby is never ahead of the master at any point.
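
As a rough illustration of that ordering, here is a toy model. It is not
PostgreSQL code: the Node class, the sync_commit function and the LSN
values are all made up for this sketch, and it assumes a single standby.

```python
class Node:
    """Toy server that tracks how much WAL it has flushed to disk."""

    def __init__(self, name):
        self.name = name
        self.flushed_lsn = 0  # highest WAL position known to be fsynced

    def fsync_wal(self, lsn):
        # Flushing never moves the flushed position backwards.
        self.flushed_lsn = max(self.flushed_lsn, lsn)


def sync_commit(master, standby, lsn):
    """Commit one transaction: master fsync, standby fsync, then reply."""
    master.fsync_wal(lsn)    # 1. fsync WAL on the master
    assert standby.flushed_lsn <= master.flushed_lsn
    standby.fsync_wal(lsn)   # 2. fsync WAL on the standby
    assert standby.flushed_lsn <= master.flushed_lsn
    return lsn               # 3. standby replies; master can confirm


# Hypothetical WAL positions, chosen only for the example.
master, standby = Node("master"), Node("standby")
confirmed = [sync_commit(master, standby, lsn) for lsn in (100, 200, 300)]
```

The two assertions check the invariant at each step: the standby's flushed
position never exceeds the master's.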

> b) a transaction that was not yet confirmed on the master but might have
> been applied on the surviving standby before the disaster - this is what I
> understand "confirm from all sync standbys" could result in.

Yes, that is described in the docs changes I published.

(a) was discussed but ruled out, since it would require any crash or
immediate shutdown of the master to become a failover, or some kind of
weird back channel to give the missing data back.

There hasn't been any difference of opinion in this area, that I am
aware of. All proposals have offered (b).
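
To make (b) concrete, here is a small sketch of the failure window. It is
only a toy model under stated assumptions: one standby, invented LSN
values, and a hypothetical run_until_crash helper that loses the master
after the standby has fsynced but before its reply arrives.

```python
def run_until_crash(lsns, crash_at):
    """Return (standby_flushed, confirmed) WAL positions for a toy run.

    The master is lost after the standby has flushed `crash_at` but
    before the standby's reply for it reaches the master.
    """
    standby_flushed = []
    confirmed = []
    for lsn in lsns:
        # The master's own fsync happens first (not modelled further).
        standby_flushed.append(lsn)  # standby fsyncs the WAL
        if lsn == crash_at:
            break                    # master lost before the reply arrives
        confirmed.append(lsn)        # reply received, commit confirmed
    return standby_flushed, confirmed


standby_flushed, confirmed = run_until_crash([100, 200, 300], crash_at=200)

# Case (b): on the standby, but never confirmed by the master.
unconfirmed_on_standby = set(standby_flushed) - set(confirmed)
# Case (a): confirmed by the master, but missing from the standby.
missing_from_standby = set(confirmed) - set(standby_flushed)
```

In this model the first set can be non-empty while the second is always
empty, since a commit is only confirmed after the standby has flushed it.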

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
