Re: eXtensible Transaction Manager API

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: eXtensible Transaction Manager API
Date: 2015-11-14 14:41:25
Message-ID: CAMsr+YEDA2gb380_i6O5SQeRiwqus0CjCLUQDnGj5teQi1DYHg@mail.gmail.com
Lists: pgsql-hackers

On 13 November 2015 at 21:35, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Tue, Nov 10, 2015 at 3:46 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> > On Sun, Nov 8, 2015 at 6:35 PM, Michael Paquier
> > <michael(dot)paquier(at)gmail(dot)com> wrote:
> >> Sure. Now imagine that the pg_twophase entry is corrupted for this
> >> transaction on one node. This would trigger a PANIC on it, and
> >> transaction would not be committed everywhere.
> >
> > If the database is corrupted, there's no way to guarantee that
> > anything works as planned. This is like saying that criticizing
> > somebody's disaster recovery plan on the basis that it will be
> > inadequate if the entire planet earth is destroyed.
>
> As well as there could be FS, OS, network problems... To come back to
> the point, my point is simply that I found surprising the sentence of
> Konstantin upthread saying that if commit fails on some of the nodes
> we should rollback the prepared transaction on all nodes. In the
> example given, in the phase after calling dtm_end_prepare, say we
> perform COMMIT PREPARED correctly on node 1, but then failed it on
> node 2 because a meteor has hit a server, it seems that we cannot
> rollback, instead we had better rolling in a backup and be sure that
> the transaction gets committed. How would you rollback the transaction
> already committed on node 1? But perhaps I missed something...
>

The usual way this works in an XA-like model is:

In phase 1 (PREPARE TRANSACTION, in Pg's spelling), failure on any node
triggers a rollback on all nodes.

In phase 2 (COMMIT PREPARED), failure on any node causes retries until it
succeeds, or until the admin intervenes - say, to remove that node from
operation. The global xact as a whole isn't considered successful until
it's committed on all nodes.
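
A rough sketch of that flow in Python, to make the asymmetry between the
two phases concrete. Everything here is hypothetical: FlakyNode is an
in-memory stand-in for a node, and its method names only mirror
PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK PREPARED; none of it is
a real Pg or XA API. The bounded retry count stands in for "until the
admin intervenes".

```python
class FlakyNode:
    """Hypothetical in-memory stand-in for a database node."""

    def __init__(self, name, prepare_ok=True, commit_failures=0):
        self.name = name
        self.prepare_ok = prepare_ok
        # Number of commit attempts that fail before one succeeds.
        self.commit_failures = commit_failures
        self.state = "idle"

    def prepare(self, gid):
        if not self.prepare_ok:
            raise RuntimeError(f"{self.name}: prepare failed for {gid}")
        self.state = "prepared"

    def commit_prepared(self, gid):
        if self.commit_failures > 0:
            self.commit_failures -= 1
            raise RuntimeError(f"{self.name}: commit failed for {gid}")
        self.state = "committed"

    def rollback(self, gid):
        self.state = "rolled back"


def two_phase_commit(nodes, gid, max_retries=5):
    # Phase 1: failure on any node triggers a rollback on all nodes.
    try:
        for node in nodes:
            node.prepare(gid)
    except RuntimeError:
        for node in nodes:
            node.rollback(gid)
        return "rolled back"

    # Phase 2: the outcome is now "commit". Failures are retried and are
    # never turned into a rollback, since other nodes may have committed.
    stuck = []
    for node in nodes:
        for _ in range(max_retries):
            try:
                node.commit_prepared(gid)
                break
            except RuntimeError:
                continue
        else:
            # Retries exhausted: this node stays prepared until someone
            # (the admin, in the text above) resolves it.
            stuck.append(node.name)

    # The global xact only counts as successful once every node committed.
    return "committed" if not stuck else "in doubt"
```

With nodes = [FlakyNode("a"), FlakyNode("b", commit_failures=2)], the
call two_phase_commit(nodes, "gx1") retries node b and ends up committed
everywhere; a node with prepare_ok=False makes the whole thing roll back
instead.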

2PC and distributed commit are well studied, including their problems. We
don't have to think this up for ourselves or invent anything here. There's
a lot of distributed systems theory to work with - especially when dealing
with well-studied relational DBs trying to maintain ACID semantics.

Not to say that there aren't problems with the established ways. The XA API
is horrific. Java's JTA follows it too closely, and whoever thought
that HeuristicMixedException was a good idea.... augh.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
