Re: Sync Rep: Second thoughts

From: Emmanuel Cecchet <manu(at)frogthinker(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Markus Wanner <markus(at)bluegap(dot)ch>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, aidan(at)highrise(dot)ca, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sync Rep: Second thoughts
Date: 2008-12-14 09:08:05
Message-ID: 4944CCF5.5090101@frogthinker.org
Lists: pgsql-hackers

Hi all,

I just wanted to point out a detail that I have not seen mentioned in
this thread (but I might have skipped some messages and I apologize in
advance if this is a duplicate).

What the application will see is a failure when the postmaster it
is connected to goes down. If this happens at commit time, I think
that there is no guarantee that the application can know what happened:
1. failure occurred before the request reached the postmaster: no instance
committed
2. failure occurred during commit: might be committed on either node
3. failure occurred while sending back the ack of commit to the client: both
instances have committed
But for the client, it will all look the same: an error on commit.
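
To make the three cases concrete, here is a toy simulation (a sketch only,
not PostgreSQL code). The names Node, commit, observe and the fail_at knob
are all hypothetical, and case 2 is modelled as just one of its possible
outcomes (first node committed, second not):

```python
class CommitError(Exception):
    """What the client sees, wherever the failure actually happened."""


class Node:
    """Toy stand-in for one database instance."""

    def __init__(self):
        self.committed = set()


def commit(txid, primary, standby, fail_at=None):
    # fail_at is a hypothetical test knob: "before", "during", "ack" or None.
    if fail_at == "before":   # case 1: request never reached the postmaster
        raise CommitError("connection lost")
    primary.committed.add(txid)
    if fail_at == "during":   # case 2: one possible outcome among several
        raise CommitError("connection lost")
    standby.committed.add(txid)
    if fail_at == "ack":      # case 3: committed everywhere, ack was lost
        raise CommitError("connection lost")
    return "COMMIT"


def observe(fail_at):
    """Return (what the client saw, committed on primary, committed on standby)."""
    primary, standby = Node(), Node()
    try:
        seen = commit(1, primary, standby, fail_at)
    except CommitError as exc:
        seen = str(exc)
    return seen, 1 in primary.committed, 1 in standby.committed


if __name__ == "__main__":
    for case in ("before", "during", "ack"):
        print(case, observe(case))
```

The first element of each result is identical for the three failure cases,
while the two booleans differ, which is exactly what the client cannot see.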

This is just to point out that despite all your efforts, the client
might think that some transactions have failed (error on commit) when
they have actually committed. If you don't put some state in the driver
that is able to check at failover time whether the commit operation
succeeded, it does not really matter what happens to in-flight
transactions (or in-commit transactions) at failure time. In all cases,
a manual inspection of the database logs will be required.
Actually, if there was a way to query the database about the status of a
particular transaction by providing a cluster-wide unique id, that would
help a lot. I wrote a paper on the issues with database replication at
SIGMOD earlier this year (http://infoscience.epfl.ch/record/129042).
Even though it was targeted at middleware replication, I think that some
of it is still relevant for the problem at hand.
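
As a rough illustration of the driver-side state and status query suggested
above, here is a toy sketch. Nothing in it is an existing PostgreSQL or
driver API: Cluster, status, commit_with_id and the lose_request/lose_ack
knobs are hypothetical names, and the "cluster" is just an in-memory
dictionary standing in for the surviving node after failover:

```python
import uuid


class Cluster:
    """Toy stand-in for whichever node survives the failover."""

    def __init__(self):
        self.outcomes = {}  # cluster-wide commit id -> "committed"

    def record_commit(self, commit_id):
        self.outcomes[commit_id] = "committed"

    def status(self, commit_id):
        # Hypothetical status query keyed by the cluster-wide unique id.
        return self.outcomes.get(commit_id, "not committed")


def commit_with_id(cluster, lose_request=False, lose_ack=False):
    """Commit under a driver-generated id and resolve any ambiguous error."""
    commit_id = str(uuid.uuid4())  # state kept in the driver across failover
    try:
        if lose_request:           # failure before the request arrived
            raise ConnectionError("connection lost")
        cluster.record_commit(commit_id)
        if lose_ack:               # failure while the ack was on its way back
            raise ConnectionError("connection lost")
        return commit_id, "committed"
    except ConnectionError:
        # The error alone is ambiguous, so ask the cluster what happened.
        return commit_id, cluster.status(commit_id)


if __name__ == "__main__":
    cluster = Cluster()
    print(commit_with_id(cluster, lose_request=True)[1])
    print(commit_with_id(cluster, lose_ack=True)[1])
```

With the id kept in the driver, the lost-request and lost-ack cases stop
looking the same to the application, without anyone reading the logs.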

Regarding the wording, if experts can't agree, you can be sure that
users won't either. Most of them don't have a clue about the different
flavors of replication. So as long as you state clearly how it behaves
and define all the terms you use, that should be fine.

manu

--
Emmanuel Cecchet
FTO @ Frog Thinker
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu(at)frogthinker(dot)org
Skype: emmanuel_cecchet
