Re: Synchronous commit not... synchronous?

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Peter van Hardenberg <pvh(at)pvh(dot)ca>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronous commit not... synchronous?
Date: 2012-11-02 20:46:47
Message-ID: CAAZKuFZ+UXdd0Hm_UOu5Z9n+8UUG7ZsaNG3q-S2npKhnfozA4A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 2, 2012 at 1:06 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> I see why it is implemented this way, but it's also still pretty
>> unsatisfying because it means that with cancellation requests clients
>> are in theory able to commit an unlimited number of transactions,
>> synchronous commit or no.
>
> What evil does this allow the client to perpetrate?

The client can commit against my will by accident in an automated
system whose behavior is at least moderately complex and hard to
understand completely for all involved, and then the client's author
subsequently writes me a desperate or angry support request asking why
data was lost. This is not the best time for me to ask "did you set up
a scheduled task to cancel hanging queries automatically? Because
yeah...there's this...thing."

>> It's probably close enough for most purposes, but what would you think
>> about a "2PC-ish" mode at the physical (rather than logical/PREPARE
>> TRANSACTION) level, whereby the master would insist that its standbys
>> have more data written (or at least received...or at least sent) than
>> it has guaranteed flushed to its own xlog at any point?
>
> Then if they interrupt the commit, the remote has it permanently but
> the local does not. That would be corruption.

That is a good point.

When the server starts up it could interrogate its standbys for WAL to
apply. My ideal is to get a relationship similar to the one between a
master and its 'local' pg_xlog, except over a socket, and possibly (but
entirely optionally) to a non-Postgres receiver of WAL that may buffer
WAL and then submit it directly to what is typically thought of as the
archives. I have a number of reasons for doing that, but they can all
be summed up as: block devices are much more prone to failures -- both
simple and byzantine -- than memory and network traffic with enough
fidelity checking (such as TLS), and the pain from block device
failures -- in particular, the byzantine ones -- is very high when
they occur.

The bar for "reliable" non-volatile storage for me is something like
Amazon's S3, and I think a lot of that has to do with the otherwise
relatively impoverished semantics it has, so I think this reliability
profile will be or has been duplicated elsewhere.

In general, this has some relation to remastering issues.

In the future, I'd like to be able to turn off the local pg_xlog, at my option.

This is something that I've been very slowly moving forward on for a
while, with the first step being writing a Postgres proxy, currently
underway. The tool support for this kind of facility does not really
exist yet, but I'll catch up some day...

> What the "DETAIL" doesn't make clear about the current system is that
> the commit *will* be replicated to the standby *eventually*, unless
> the master burns down first. In particular, if any commit after this
> one makes it to the standby, then the interrupted one is guaranteed to
> have made it as well.
>
>> This would be a nice invariant to have when dealing with a large
>> number of systems, allowing for the catching of some tricky bugs, that
>> standbys are always greater-than-or-equal-to the master's XLogPos.
>
> Could you elaborate on that?

Sure. I'd like to sanity check failovers with as many simple
invariants as I can to catch problems. Losing a cheap-to-confirm
invariant is losing a check, and that would be unfortunate when doing
more failovers simultaneously than humans can realistically be
involved with in a short amount of time, as the results of a bug there
are most unpleasant.
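
To make the kind of check I mean concrete, here is a rough sketch. It
assumes the proposed mode exists (it does not today), that positions
arrive as the usual text form "hi/lo" in hex, and that something else
has already collected them. The function names and the standby names
are made up for illustration; nothing here is an existing Postgres
interface.

```python
from typing import Dict, List, Tuple


def parse_xlog_pos(pos: str) -> Tuple[int, int]:
    """Parse a text position such as '16/B374D848' into a comparable tuple.

    Both halves are hexadecimal. Comparing the (hi, lo) tuples orders
    positions the same way the server does.
    """
    hi, lo = pos.strip().split("/")
    return int(hi, 16), int(lo, 16)


def lagging_standbys(master_pos: str, standby_pos: Dict[str, str]) -> List[str]:
    """Return the names of standbys that are behind the master.

    Under the hypothetical invariant (standby >= master at all times),
    this list should always be empty.
    """
    master = parse_xlog_pos(master_pos)
    return sorted(
        name
        for name, pos in standby_pos.items()
        if parse_xlog_pos(pos) < master
    )


def invariant_holds(master_pos: str, standby_pos: Dict[str, str]) -> bool:
    """True when every standby is at or past the master's position."""
    return not lagging_standbys(master_pos, standby_pos)


if __name__ == "__main__":
    # Made-up positions for illustration only.
    standbys = {"standby-a": "16/B374D848", "standby-b": "16/B374D000"}
    print(lagging_standbys("16/B374D848", standbys))
```

An automated failover could refuse to proceed, or at least raise an
alarm, whenever invariant_holds() is false, without a human having to
look at each case.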

--
fdr
