Re: Transactions involving multiple postgres foreign servers

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transactions involving multiple postgres foreign servers
Date: 2015-01-07 07:16:00
Message-ID: CAFjFpRc_FX=DiA3vVOtBFg2OimKdCsVWLe4xDu6WUXV5SLj_oA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 5, 2015 at 11:55 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Jan 2, 2015 at 3:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > In short, you can't force 2PC technology on people who aren't using it
> > already; while for those who are using it already, this isn't nearly
> > good enough as-is.
>
> I was involved in some internal discussions related to this patch, so
> I have some opinions on it. The long-term, high-level goal here is to
> facilitate sharding. If we've got a bunch of PostgreSQL servers
> interlinked via postgres_fdw, it should be possible to perform
> transactions on the cluster in such a way that transactions are just
> as atomic, consistent, isolated, and durable as they would be with
> just one server. As far as I know, there is no way to achieve this
> goal through the use of an external transaction manager, because even
> if that external transaction manager guarantees, for every
> transaction, that the transaction either commits on all nodes or rolls
> back on all nodes, there's no way for it to guarantee that other
> transactions won't see some intermediate state where the commit has
> been completed on some nodes but not others. To get that, you need
> some kind of integration that reaches down to the way snapshots are taken.
>
> I think, though, that it might be worthwhile to first solve the
> simpler problem of figuring out how to ensure that a transaction
> commits everywhere or rolls back everywhere, even if intermediate
> states might still be transiently visible.

Agreed.

> I don't think this patch,
> as currently designed, is equal to that challenge, because
> XACT_EVENT_PRE_COMMIT fires before the transaction is certain to
> commit - PreCommit_CheckForSerializationFailure or PreCommit_Notify
> could still error out. We could have a hook that fires after that,
> but that doesn't solve the problem if a user of that hook can itself
> throw an error. Even if part of the API contract is that it's not
> allowed to do so, the actual attempt to commit the change on the
> remote side can fail due to - e.g. - a network interruption, and
> that's got to be dealt with somehow.
>
>
Tom mentioned
--
in particular it treats the local transaction
asymmetrically from the remote ones, which doesn't seem like a great
idea --- ie, the local transaction could still abort after committing
all the remote ones, leaving you no better off in terms of cross-server
consistency.
--
You have given a specific example of this case. So let me do a dry run
through CommitTransaction() with my patch applied.
1899 CallXactCallbacks(XACT_EVENT_PRE_COMMIT);

While processing this event, postgres_fdw's callback
pgfdw_xact_callback() sends PREPARE TRANSACTION to all the foreign
servers involved, and each server returns success or failure. If even
one of them fails, the local transaction is aborted along with all the
prepared remote transactions. Only if all the foreign servers succeed do
we proceed further.
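To make that phase concrete, here is a small Python sketch of the logic (purely illustrative, with invented names; it is not the actual postgres_fdw C code): PREPARE TRANSACTION is attempted on every foreign server, and a single failure rolls back the local transaction and every remote one, prepared or not.

```python
class ForeignServer:
    """Toy stand-in for a remote server reached via postgres_fdw."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.state = "active"        # active -> prepared -> aborted

    def prepare(self, gid):
        """Simulate PREPARE TRANSACTION 'gid'."""
        if self.fail_on_prepare:
            self.state = "aborted"
            raise RuntimeError(
                f"PREPARE TRANSACTION '{gid}' failed on {self.name}")
        self.state = "prepared"

    def rollback(self, gid):
        """Simulate ROLLBACK, or ROLLBACK PREPARED if already prepared."""
        self.state = "aborted"


def pre_commit(servers, gid):
    """Mimic the XACT_EVENT_PRE_COMMIT handling described above.

    Returns True only if every server prepared successfully; on any
    failure, every remote transaction is rolled back (and the caller
    would abort the local transaction too).
    """
    try:
        for s in servers:
            s.prepare(gid)
    except RuntimeError:
        for s in servers:
            s.rollback(gid)
        return False
    return True
```

The point of the sketch is the all-or-nothing property of phase one: nothing is irrevocable yet, so a single PREPARE failure can still abort everything consistently.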

1925         PreCommit_CheckForSerializationFailure();

1932         PreCommit_Notify();

If either of these functions throws an error (as you mentioned above),
the local transaction is aborted, and so are the remote prepared
transactions. Note that at this point we have committed neither the
local transaction (that happens below) nor the remote transactions,
which are still in PREPARED state. Since all the transactions, local as
well as remote, are aborted on error, the data remains consistent. If
these steps succeed, we proceed ahead.

1934         /* Prevent cancel/die interrupt while cleaning up */
1935         HOLD_INTERRUPTS();
1936
1937         /* Commit updates to the relation map --- do this as late as possible */
1938         AtEOXact_RelationMap(true);
1939
1940         /*
1941          * set the current transaction state information appropriately during
1942          * commit processing
1943          */
1944         s->state = TRANS_COMMIT;
1945
1946         /*
1947          * Here is where we really truly commit.
1948          */
1949         latestXid = RecordTransactionCommit();
1950
1951         TRACE_POSTGRESQL_TRANSACTION_COMMIT(MyProc->lxid);
1952
1953         /*
1954          * Let others know about no transaction in progress by me. Note that this
1955          * must be done _before_ releasing locks we hold and _after_
1956          * RecordTransactionCommit.
1957          */
1958         ProcArrayEndTransaction(MyProc, latestXid);

The local transaction is now committed, while the remote transactions
are still in PREPARED state. If any server (including the local one)
crashes, or a link failure occurs at this point, the remote transactions
are left dangling in PREPARED state and manual cleanup will be required.

1976         CallXactCallbacks(XACT_EVENT_COMMIT);

The postgres_fdw callback pgfdw_xact_callback() then commits the
PREPARED transactions by sending COMMIT PREPARED to each remote server
(with my patch). So I don't see why my patch would cause
inconsistencies. It can leave dangling PREPARED transactions, and I have
already acknowledged that fact.
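The whole sequence, including the failure window just described, can be sketched end to end (again purely illustrative Python with invented names; the real code lives in CommitTransaction() and pgfdw_xact_callback()):

```python
class Resource:
    """Toy stand-in for the local server or a remote foreign server."""

    def __init__(self, name):
        self.name = name
        self.state = "active"   # active -> prepared -> committed


def commit_transaction(local, remotes, gid, crash_after_local_commit=False):
    """Simulate the commit sequence walked through above."""
    # Phase 1 (XACT_EVENT_PRE_COMMIT): PREPARE TRANSACTION on every remote.
    # Any error raised here would still abort everything consistently.
    for r in remotes:
        r.state = "prepared"

    # RecordTransactionCommit(): the local commit becomes irrevocable.
    local.state = "committed"

    if crash_after_local_commit:
        # Crash or link failure in this window: the remotes are stuck in
        # PREPARED state until someone issues COMMIT PREPARED manually --
        # dangling, but never committed on one side and rolled back on
        # the other.
        return

    # Phase 2 (XACT_EVENT_COMMIT): COMMIT PREPARED on every remote.
    for r in remotes:
        r.state = "committed"
```

The sketch makes the claim explicit: a failure between the local commit and phase two produces dangling PREPARED transactions, not divergent outcomes.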

Am I missing something?

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
