Re: [HACKERS] Transactions involving multiple postgres foreign servers

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Transactions involving multiple postgres foreign servers
Date: 2018-05-22 07:41:44
Message-ID: CAD21AoCZxX6qKC3yHZTCbn14i03mGZf47d5qdA22LyvnmDthOA@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 21, 2018 at 10:42 AM, Tsunakawa, Takayuki
<tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
> From: Masahiko Sawada [mailto:sawada(dot)mshk(at)gmail(dot)com]
>> Regarding the API design, should we use 2PC for a distributed
>> transaction when both 2PC-capable and 2PC-non-capable foreign
>> servers are involved in it? Or should we end up with an error?
>> A 2PC-non-capable server might be one that has 2PC functionality
>> but disables it, or one that doesn't have it at all.
>
>> but I think we could also take the latter way, because a commit
>> that is atomic among only some of the participants is not useful
>> to the user.
>
> I'm for the latter. That is, a COMMIT or PREPARE TRANSACTION statement issued from an application reports an error.

I'm not sure that we should end up with an error in such a case, but
if we want to, we can raise an error when the transaction tries to
modify a 2PC-non-capable server after it has modified a 2PC-capable
server.

> DBMSs, particularly relational DBMSs (and even more particularly Postgres?), place high value on data correctness. So I think transaction atomicity should be preserved, at least by default. If we preferred updatability and performance to data correctness, why wouldn't we change the default value of synchronous_commit to off in favor of performance? On the other hand, if we want to allow 1PC commit when not all FDWs support 2PC, we can add a new GUC parameter like "allow_nonatomic_commit = on", just as synchronous_commit and fsync trade off data correctness and performance.

Honestly, I'm not sure we should use atomic commit by default at this
point, because that would change the default behavior for existing
users who currently run without 2PC. But I think controlling global
transaction atomicity through a GUC parameter would be a good idea.
For example, synchronous_commit = 'global' could make backends wait
for the transaction to be resolved globally before returning to the
user.
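As a sketch of what that could look like from an application (note:
'global' is not an accepted synchronous_commit value today, it is only
the idea being proposed here, and the table names are made up):

```sql
-- Hypothetical: 'global' is not an existing synchronous_commit value
SET synchronous_commit = 'global';

BEGIN;
UPDATE local_accounts   SET balance = balance - 100 WHERE id = 1;
UPDATE foreign_accounts SET balance = balance + 100 WHERE id = 1;  -- via FDW
COMMIT;  -- would return only after all foreign transactions are resolved
```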

>
>
>> Also, regardless of which way we take, I think it would be better
>> to manage not only 2PC transactions but also non-2PC transactions
>> in the core, and to add a two_phase_commit argument. I think we can
>> do that without breaking existing FDWs. Currently FDWs manage
>> transactions using XactCallback, but the new APIs being added also
>> manage transactions. I think it would be better if users used either
>> one way (XactCallback) or the other (the new APIs) for transaction
>> management, rather than combining both. Otherwise two pieces of
>> transaction-management code would be required: code that manages
>> foreign transactions using XactCallback for non-2PC transactions,
>> and code that manages them using the new APIs for 2PC transactions.
>> That would not be easy for FDW developers. So what I imagined for
>> the new API is that if FDW developers use the new APIs they can use
>> both 2PC and non-2PC transactions, but if they use XactCallback
>> they can use only non-2PC transactions.
>> Any thoughts?
>
> If we add new functions, can't we just add functions whose names are straightforward, like PrepareTransaction() and CommitTransaction()? FDWs without 2PC support return NULL for the function pointer of PrepareTransaction().
>
> This is similar to XA: XA requires each RM to provide function pointers for xa_prepare() and xa_commit(). If we go this way, maybe we could leverage the transaction-control code of postgres_fdw to create an XA library for C/C++. I mean, we put the transaction control functions in the XA library, and postgres_fdw also uses it, i.e.:
>
> postgres_fdw.so -> libxa.so -> libpq.so
> \-------------/

I might not be understanding your comment correctly, but the current
patch is implemented in that way. The patch introduces new FDW APIs:
PrepareForeignTransaction, EndForeignTransaction,
ResolvePreparedForeignTransaction and GetPrepareId. The postgres core
calls each API at the appropriate time while managing each foreign
transaction. FDWs that don't support 2PC set those function pointers
to NULL.

Also, the current API design might not fit databases other than
PostgreSQL. For example, in MySQL we have to start an XA transaction
explicitly using XA START, whereas PostgreSQL can prepare a transaction
that was started by an ordinary BEGIN TRANSACTION. So in MySQL the
global transaction id is required at the beginning of the XA
transaction, and we have to execute XA END before we prepare it or
commit it in one phase. So it might be better to define the APIs
according to X/Open XA in order to make them more general.
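For reference, the two command sequences look roughly like this (the
table and transaction identifiers are made up; see each database's
documentation for the exact syntax):

```sql
-- PostgreSQL: the global transaction id appears only at prepare time
BEGIN;
UPDATE accounts SET balance = 0 WHERE id = 1;
PREPARE TRANSACTION 'gxact1';
COMMIT PREPARED 'gxact1';

-- MySQL: the xid is required up front, and XA END must be issued
-- before XA PREPARE (or XA COMMIT ... ONE PHASE)
XA START 'gxact1';
UPDATE accounts SET balance = 0 WHERE id = 1;
XA END 'gxact1';
XA PREPARE 'gxact1';
XA COMMIT 'gxact1';
```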

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
