Re: Transactions involving multiple postgres foreign servers

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: hlinnaka <hlinnaka(at)iki(dot)fi>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transactions involving multiple postgres foreign servers
Date: 2015-07-29 10:58:29
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

On Fri, Jul 17, 2015 at 10:20 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Overall, you seem to have made some significant progress on the design
> since the last version of this patch. There's probably a lot left to
> do, but the design seems more mature now. I haven't read the code,
> but here are some comments based on the email.
Thanks for your comments.

I have incorporated most of your suggestions (marked as Done) in the
attached patch.

> On Thu, Jul 9, 2015 at 6:18 AM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> > The patch introduces a GUC atomic_foreign_transaction, which when ON
> ensures
> > atomic commit for foreign transactions, otherwise not. The value of this
> > at the time of committing or preparing a local transaction is used. This
> > gives applications the flexibility to choose the behaviour as late in the
> > transaction as possible. This GUC has no effect if there are no foreign
> > servers involved in the transaction.
> Hmm. I'm not crazy about that name, but I don't have a better idea either.
> One thing about this design is that it makes atomicity a property of
> the transaction rather than the server. That is, any given
> transaction is either atomic with respect to all servers or atomic
> with respect to none. You could also design this the other way: each
> server is either configured to do atomic commit, or not. When a
> transaction is committed, it prepares on those servers which are
> configured for it, and then commits the others. So then you can have
> a "partially atomic" transaction where, for example, you transfer
> money from account A to account B (using one or more FDW connections
> that support atomic commit) and also use twitter_fdw to tweet about it
> (using an FDW connection that does NOT support atomic commit). The
> tweet will survive even if the local commit fails, but that's OK. You
could even do this at table granularity: we'll prepare the transaction
> if at least one foreign table involved in the transaction has
> atomic_commit = true.

> In some sense I think this might be a nicer design, because suppose
> you connect to a foreign server and mostly just log stuff but
> occasionally do important things there. In your design, you can do
> this, but you'll need to make sure atomic_foreign_transaction is set
> for the correct set of transactions. But in what I'm proposing here
> we might be able to derive the correct value mostly automatically.
A user may set atomic_foreign_transaction to ON to guarantee atomicity, IOW
it throws error when atomicity can not be guaranteed. Thus if application
accidentally does something to a foreign server, which doesn't support 2PC,
the transaction would abort. A user may set it to OFF (consciously and
takes the responsibility of the result) so as not to use 2PC (probably to
reduce the overheads) even if the foreign server is 2PC compliant. So, I
thought a GUC would be necessary. We can incorporate the behaviour you are
suggesting by having atomic_foreign_transaction accept three values "full"
(ON behaviour), "partial" (behaviour you are describing), "none" (OFF
behaviour). Default value of this GUC would be "partial". Will that be fine?

About table level atomic commit attribute, I agree that some foreign tables
might hold "more critical" data than others from the same server, but I am
not sure whether only that attribute should dictate the atomicity or not. A
transaction collectively might need to be "atomic" even if the individual
tables it modified are not set atomic_commit attribute. So, we need a
transaction level attribute for atomicity, which may be overridden by a
table level attribute. Should we add support to the table level atomicity
setting as version 2+?

> We should consider other possible designs as well; the choices we make
> here may have a significant impact on usability.
I looked at other RBDMSes like IBM's federated database or Oracle. They
support only "full" behaviour as described above with some optimizations
like LRO. But, I would like to hear about other options.

> > Another GUC max_fdw_transactions sets the maximum number of transactions
> > that can be simultaneously prepared on all the foreign servers. This
> limits
> > the memory required for remembering the prepared foreign transactions.
> How about max_prepared_foreign_transactions?


> > Two new FDW hooks are introduced for transaction management.
> > 1. GetPrepareId: to get the prepared transaction identifier for a given
> > foreign server connection. An FDW which doesn't want to support this
> feature
> > can keep this hook undefined (NULL). When defined the hook should return
> a
> > unique identifier for the transaction prepared on the foreign server. The
> > identifier should be unique enough not to conflict with currently
> prepared
> > or future transactions. This point will be clear when discussing phase 2
> of
> > 2PC.
> >
> > 2. HandleForeignTransaction: to end a transaction in specified way. The
> hook
> > should be able to prepare/commit/rollback current running transaction on
> > given connection or commit/rollback a previously prepared transaction.
> This
> > is described in detail while describing phase two of two-phase commit.
> The
> > function is required to return a boolean status of whether the requested
> > operation was successful or not. The function or its minions should not
> > raise any error on failure so as not to interfere with the distributed
> > transaction processing. This point will be clarified more in the
> description
> > below.
> HandleForeignTransaction is not very descriptive, and I think you're
> jamming together things that ought to be separated. Let's have a
> PrepareForeignTransaction and a ResolvePreparedForeignTransaction.

Done, there are three hooks now
1. For preparing a foreign transaction
2. For resolving a prepared foreign transaction
3. For committing/aborting a running foreign transaction (more explanation

> > A foreign server, user mapping corresponding to an unresolved foreign
> > transaction is not allowed to be dropped or altered until the foreign
> > transaction is resolved. This is required to retain the connection
> > properties which need to resolve the prepared transaction on the foreign
> > server.
> I agree with not letting it be dropped, but I think not letting it be
> altered is a serious mistake. Suppose the foreign server dies in a
> fire, its replica is promoted, and we need to re-point the master at
> the replica's hostname or IP.


IP might be fine, but consider altering dbname option or dropping it; we
won't find the prepared foreign transaction in new database. I think we
should at least warn the user that there exist a prepared foreign
transaction on given foreign server or user mapping; better even if we let
FDW decide which options are allowed to be altered when there exists a
foreign prepared transaction. The later requires some surgery in the way we
handle the options.

> > Handling non-atomic foreign transactions
> > ===============================
> > When atomic_foreign_transaction is disabled, one-phase commit protocol is
> > used to commit/rollback the foreign transactions. After the local
> > transaction has committed/aborted, all the foreign transactions on the
> > registered foreign connections are committed or aborted resp. using hook
> > HandleForeignTransaction. Failing to commit a foreign transaction does
> not
> > affect the other foreign transactions; they are still tried to be
> committed
> > (if the local transaction commits).
> Is this a change from the current behavior?

There is no current behaviour defined per say. Each FDW is free to add its
transaction callbacks, which can commit/rollback their respective
transactions at pre-commit time or after the commit. postgres_fdw's
callback tries to commit the foreign transactions on PRE_COMMIT event and
throws error if that fails.

> What if we call the first
> commit handler and it throws an ERROR? Presumably then nothing else
> gets committed, and the transaction overall aborts.
In this case, the fate of transaction depends upon the order in which
foreign transactions are committed, in turn the order in which the foreign
transactions are started. This can result in non-deterministic results. The
patch tries to give it a deterministic behaviour: commit whatever can be
committed and abort rest. This requires EndForeignTransaction
(HandleForeignTransaction in the earlier patch) hook not to raise error.
Although I do not know how to prevent it from throwing an error. We may try
catching the error and not rethrowing them. But I haven't tried that.

The same requirement goes with ResolvePreparedForeignTransaction(). If that
hook throws an error, we end up with unresolved prepared transactions,
which will be committed only when the resolver kicks in.

> > ====
> > PITR may rewind the database to a point before an xid associated with an
> > unresolved foreign transaction. There are two approaches to deal with the
> > situation.
> > 1. Just forget about the unresolved foreign transaction and remove the
> file
> > just like we do for a prepared local transaction. But then the prepared
> > transaction on the foreign server might be left unresolved forever and
> will
> > keep holding the resources.
> > 2. Do not allow PITR to such point. We can not get rid of the
> transaction id
> > without getting rid of prepared foreign transaction. If we do so, we
> might
> > create conflicting files in future and might resolve the transaction with
> > wrong outcome.
> I don't think either of these is correct. The database shouldn't
> behave differently when PITR is used than when it isn't. Otherwise
> you are not doing what it says on the tin: recovering to the chosen
> point in time. I recommend adding a function that forgets about a
> foreign prepared transaction and making it the DBA's job to figure out
> whether to call it in a particular scenario. After all, the remote
> machine might have been subjected to PITR, too. Or maybe not. We
> can't know, so we should give the DBA the tools to clean things up and
> leave it at that.

I have added a built-in pg_fdw_remove() (or any suitable name), which
removes the prepared foreign transaction entry from the memory and disk.
The function needs to be called before attempting PITR. If the recovery
points to a past time without removing file, we abort the recovery. In such
case, a DBA can remove the foreign prepared transaction file manually
before recovery. I have added a hint with that effect in the error message.
Is that enough?

I noticed that the functions pg_fdw_resolve() and pg_fdw_remove() which
resolve or remove unresolved prepared foreign transaction resp. are
effecting changes which can not be rolled back if the transaction which ran
these functions rolled back. These need to be converted into SQL command
like ROLLBACK PREPARED which can't be run within a transaction.

> > IIUC LRO, the patch uses the local transaction as last resource, which is
> > always present. The fate of foreign transaction is decided by the fate of
> > the local transaction, which is not required to be prepared per say.
> There
> > is more relevant note later.
> Personally, I think that's perfectly fine. We could do more later if
> we wanted to, but there's plenty to like here without that.


> >> Just to be clear: you also need two-phase commit if the transaction
> >> updated anything in the local server and in even one foreign server.
> >
> > Any local transaction involving a foreign sever transaction uses
> two-phase
> > commit for the foreign transaction. The local transaction is not prepared
> > per say. However, we should be able to optimize a case, when there are no
> > local changes. I am not able to find a way to deduce that there was no
> local
> > change, so I have left that case in this patch. Is there a way to know
> > whether a local transaction changed something locally or not?
> You might check whether it wrote any WAL. There's a global variable
> for that somewhere; RecordTransactionCommit() uses it. But I don't
> think this is an essential optimization for v1, either.


> > I have used approach similar to pg_twophase, but implemented it as a
> > separate code, as the requirements differ. But, I would like to minimize
> > code by unifying both, if we finalise this design. Suggestions in this
> > regard will be very helpful.
> -1 for trying to unify those unless it's really clear that it's a good
> idea. I bet it's not.


> >> Or you could insert/update the rows in the catalog with xmin=FrozenXid,
> >> ignoring MVCC. Not sure how well that would work.
> >
> > I am not aware how to do that. Do we have any precedence in the code.
> No. I bet that's also a bad idea. A non-transactional table is a
> good idea that has been proposed before, but let's not try to invent
> it in this patch.


> --
> Robert Haas
> EnterpriseDB:
> The Enterprise PostgreSQL Company

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_fdw_transact.patch text/x-patch 206.5 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Oleksii Kliukin 2015-07-29 11:06:34 Re: REVOKE [ADMIN OPTION FOR] ROLE
Previous Message Shulgin, Oleksandr 2015-07-29 10:44:36 Re: deparsing utility commands