Re: Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, MARK CALLAGHAN <mdcallag(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.
Date: 2011-03-18 16:33:26
Message-ID: AANLkTinSxW_DrJwwfcOoEoJV-UeuVZjrtoSKeP5R69WE@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Fri, Mar 18, 2011 at 12:19 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Fri, 2011-03-18 at 17:47 +0200, Heikki Linnakangas wrote:
>> On 18.03.2011 16:52, Kevin Grittner wrote:
>> > Simon Riggs<simon(at)2ndQuadrant(dot)com>  wrote:
>> >
>> >> In PostgreSQL other users cannot observe the commit until an
>> >> acknowledgement has been received.
>> >
>> > Really?  I hadn't picked up on that.  That makes for a lot of
>> > complication on crash-and-recovery of a master, but if we can pull
>> > it off, that's really cool.  If we do that and MySQL doesn't, we
>> > definitely don't want to use the same terminology they do, which
>> > would imply the same behavior.
>>
>> To be clear: other users cannot observe the commit until standby
>> acknowledges it - unless the master crashes while waiting for the
>> acknowledgment. If that happens, the commit will be visible to everyone
>> after recovery.
>
> No, only in the case where you choose not to failover to the standby
> when you crash, which would be a fairly strange choice after the effort
> to set up the standby. In a correctly configured and operated cluster
> what I say above is fully correct and needs no addendum.

Except it doesn't work that way. If, say, a backend on the master
core dumps, the system will perform a crash and restart cycle, and the
transaction will become visible whether it's yet been replicated or
not. Since we now have a GUC to suppress restart after a backend
crash, it's theoretically possible to set up the system so that this
doesn't occur, but it'd take quite a bit of work to make it robust and
automatic, and it's certainly not the default out of the box.

The fundamental problem here is that once you update CLOG and flush
the corresponding WAL record, there is no going backward. You can
hold the system in some intermediate state where the transaction still
holds locks and is excluded from MVCC snapshots, but there's no way to
back up. So there are bound to be corner cases where the
wait doesn't last as long as you want, and stuff leaks out around the
edges. It's fundamentally impossible to guarantee that you'll remain
in that intermediate state forever - what do you do if a meteor hits
the synchronous standby and at the same time you lose power to the
master? No amount of configuration will save you from coming back on
line with a visible-but-unreplicated transaction. I'm not knocking
the system; I think what we have is impressively good. But pretending
that corner cases can't happen gets us nowhere.
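
To make the sequence above concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not PostgreSQL code: the class ToyMaster and all of its fields and methods are hypothetical names invented for illustration. It models just three facts from the paragraph above: the commit record is flushed and the transaction marked committed before the wait begins, the waiting transaction stays excluded from snapshots, and a crash-and-restart cycle discards that in-memory exclusion.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Hypothetical toy model of a master; not real PostgreSQL internals."""
    wal_flushed: set = field(default_factory=set)     # commit records durable locally
    clog_committed: set = field(default_factory=set)  # xids marked committed
    in_progress: set = field(default_factory=set)     # xids excluded from snapshots (in memory)
    standby_acked: set = field(default_factory=set)   # xids the standby has acknowledged

    def begin(self, xid: int) -> None:
        self.in_progress.add(xid)

    def commit_local(self, xid: int) -> None:
        # Point of no return: the commit record is flushed and the
        # transaction is marked committed. There is no going backward.
        self.wal_flushed.add(xid)
        self.clog_committed.add(xid)
        # The xid deliberately stays in in_progress while we wait for
        # the standby, so other sessions cannot see the commit yet.

    def standby_ack(self, xid: int) -> None:
        # The acknowledgement ends the wait and makes the commit visible.
        self.standby_acked.add(xid)
        self.in_progress.discard(xid)

    def visible(self, xid: int) -> bool:
        return xid in self.clog_committed and xid not in self.in_progress

    def crash_and_recover(self) -> None:
        # In-memory state is lost; recovery replays the flushed WAL, so
        # every flushed commit becomes visible, acknowledged or not.
        self.in_progress.clear()
        self.clog_committed = set(self.wal_flushed)


def demo() -> ToyMaster:
    m = ToyMaster()
    m.begin(1)
    m.commit_local(1)
    print("visible while waiting:", m.visible(1))
    m.crash_and_recover()
    print("visible after crash:", m.visible(1))
    print("acknowledged by standby:", 1 in m.standby_acked)
    return m


if __name__ == "__main__":
    demo()
```

In this model, the only way to keep transaction 1 invisible after the crash is to never restart the master, which is the case where you fail over to the standby instead.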

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
