Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
Date: 2018-06-11 04:53:08
Message-ID: CAD21AoDTAGj5MohEPQps0GjmF413pO9uwykqZ36g7JMh0a+UjA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 5, 2018 at 7:13 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Sat, May 26, 2018 at 12:25 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, May 18, 2018 at 11:21 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> Regarding the API design, should we use 2PC for a distributed
>>> transaction if it involves both two or more 2PC-capable foreign
>>> servers and a 2PC-non-capable foreign server? Or should we end
>>> up with an error? The 2PC-non-capable server might be either one that
>>> has 2PC functionality but disables it, or one that doesn't have it.
>>
>> It seems to me that this is functionality that many people will not
>> want to use. First, doing a PREPARE and then a COMMIT for each FDW
>> write transaction is bound to be more expensive than just doing a
>> COMMIT. Second, because the default value of
>> max_prepared_transactions is 0, this can only work at all if special
>> configuration has been done on the remote side. Because of the second
>> point in particular, it seems to me that the default for this new
>> feature must be "off". It would make no sense to ship a default configuration
>> of PostgreSQL that doesn't work with the default configuration of
>> postgres_fdw, and I do not think we want to change the default value
>> of max_prepared_transactions. It was changed from 5 to 0 a number of
>> years back for good reason.
>
> I'm not sure that many people will not want to use this feature,
> because it seems to me that many people don't want to use a
> database that lacks transaction atomicity. But I agree
> that this feature should not be enabled by default, since 2PC is
> disabled by default.
>
>>
>> So, I think the question could be broadened a bit: how you enable this
>> feature if you want it, and what happens if you want it but it's not
>> available for your choice of FDW? One possible enabling method is a
>> GUC (e.g. foreign_twophase_commit). It could be true/false, with true
>> meaning use PREPARE for all FDW writes and fail if that's not
>> supported, or it could be three-valued, like require/prefer/disable,
>> with require throwing an error if PREPARE support is not available and
>> prefer using PREPARE where available but without failing when it isn't
>> available. Another possibility could be to make it an FDW option,
>> possibly capable of being set at multiple levels (e.g. server or
>> foreign table). If any FDW involved in the transaction demands
>> distributed 2PC semantics then the whole transaction must have those
>> semantics or it fails. I was previously leaning toward the latter
>> approach, but I guess now the former approach is sounding better. I'm
>> not totally certain I know what's best here.
>>
>
> I agree that the former is better. That way, we can also control the
> parameter at the transaction level. If we allow the 'prefer' behavior, we
> need to manage not only 2PC-capable foreign servers but also
> 2PC-non-capable ones, which requires every FDW to call the
> registration function. So I think a two-valued parameter would be
> better.
>
> BTW, sorry for the late submission of the updated patch. I'll post it
> this week, but I'd like to share the new API design
> beforehand.
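
For reference, the GUC shapes discussed in the quoted text above (a
boolean versus require/prefer/disable) can be modelled roughly as
follows. This is only an illustrative Python sketch, not code from the
patch; Mode, plan_commit and the server names are hypothetical. The
boolean variant corresponds to using only REQUIRE and DISABLE.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical values for a foreign_twophase_commit-style setting."""
    REQUIRE = "require"
    PREFER = "prefer"
    DISABLE = "disable"


def plan_commit(mode, servers_support_2pc):
    """Return a commit method per server, or raise if 'require' cannot be met.

    servers_support_2pc maps a server name to True when PREPARE is
    available on that server (assumption: known before commit).
    """
    if mode is Mode.DISABLE:
        # Plain COMMIT everywhere; no atomicity across servers.
        return {name: "commit" for name in servers_support_2pc}

    plan = {}
    for name, capable in servers_support_2pc.items():
        if capable:
            plan[name] = "prepare+commit"
        elif mode is Mode.REQUIRE:
            # 'require' fails the whole transaction if any server lacks 2PC.
            raise RuntimeError(f"server {name} cannot PREPARE")
        else:
            # 'prefer' falls back to a plain COMMIT on this server.
            plan[name] = "commit"
    return plan
```

With 'prefer', the coordinator has to track 2PC-non-capable servers as
well, which is the extra bookkeeping the two-valued design avoids.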

Attached are the updated patches.

I've changed the new API to five functions plus one registration
function, because in the previous design the rollback API could be
called by both the backend process and the resolver process, which is
not good design. The latest patches incorporate all the comments I
received, except for the user-facing overview documentation. I'm still
considering what that documentation should contain, and I'll write it
while the code is being reviewed. The basic design of the new patches is
almost the same as in my previous mail.

I introduced five new FDW APIs: PrepareForeignTransaction,
CommitForeignTransaction, RollbackForeignTransaction,
ResolveForeignTransaction and IsTwophaseCommitEnabled.
ResolveForeignTransaction is normally called by the resolver process,
whereas the other four functions are called by the backend process. I
also introduced a registration function,
FdwXactRegisterForeignTransaction. An FDW that wishes to support atomic
commit must call this function when a transaction opens on the foreign
server. Registered foreign transactions are controlled by the foreign
transaction manager in Postgres core, which calls the APIs at the
appropriate times. This means that the foreign transaction manager
controls only foreign servers that are capable of 2PC. For a
2PC-non-capable foreign server, the FDW must use XactCallback to control
the foreign transaction.

2PC is used at commit when the distributed transaction modified data on
two or more servers, including the local server, and the user requested
it through the foreign_twophase_commit GUC parameter. In that case all
foreign transactions are prepared during pre-commit, and then the
transaction commits locally. After the local commit, the backend waits
for the resolver process to resolve all prepared foreign transactions.
The waiting backend is released (that is, returns the prompt to the
client) either when all foreign transactions are resolved or when the
user cancels the wait. If 2PC is not required, a foreign transaction is
committed during the pre-commit phase of the local transaction.

IsTwophaseCommitEnabled is called whenever the transaction begins to
modify data on a foreign server. This is required to track whether the
transaction modified data on a foreign server that doesn't support 2PC
or doesn't have it enabled.
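
To make the commit sequence described above easier to follow, here is a
rough Python model of the ordering. It is a sketch under simplifying
assumptions, not the patch's implementation: Coordinator, Participant
and their methods are hypothetical names, every registered participant
is assumed to have modified data, and the resolver is run inline rather
than as a separate process.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical stand-in for one registered foreign transaction."""
    name: str
    state: str = "active"  # active -> prepared -> committed


@dataclass
class Coordinator:
    foreign_twophase_commit: bool        # models the GUC setting
    local_modified: bool = False         # did the local server write?
    participants: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def register(self, name):
        # Plays the role of the registration call made when a
        # transaction opens on a foreign server.
        self.participants.append(Participant(name))

    def needs_2pc(self):
        # 2PC only when two or more servers (local included) wrote data
        # and the user asked for it.
        modified = len(self.participants) + (1 if self.local_modified else 0)
        return self.foreign_twophase_commit and modified >= 2

    def commit(self):
        if self.needs_2pc():
            for p in self.participants:   # pre-commit: prepare everything
                p.state = "prepared"
                self.log.append(f"prepare {p.name}")
            self.log.append("commit local")
            self.resolve()                # backend would wait for this
        else:
            for p in self.participants:   # pre-commit: plain commit
                p.state = "committed"
                self.log.append(f"commit {p.name}")
            self.log.append("commit local")

    def resolve(self):
        # Stand-in for the resolver committing prepared transactions.
        for p in self.participants:
            if p.state == "prepared":
                p.state = "committed"
                self.log.append(f"resolve {p.name}")
```

For example, with the GUC on, one foreign server registered and a local
write, the log reads: prepare, local commit, resolve. With the GUC off,
the foreign commits happen before the local commit and nothing is
prepared.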

Atomic commit among multiple foreign servers is crash-safe. If the
coordinator server crashes during atomic commit, the foreign
transaction participants and their status are recovered during WAL
replay. Recovered foreign transactions are in the in-doubt state (also
known as dangling transactions). If the database has such transactions,
the resolver process periodically tries to resolve them.
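
The periodic retry can be pictured with the following small Python
sketch. It is only a model of the idea: resolve_in_doubt and its
arguments are hypothetical, and a real resolver would sleep between
passes and talk to the foreign servers instead of calling a local
function.

```python
def resolve_in_doubt(in_doubt, try_resolve, max_rounds):
    """Retry resolution of in-doubt foreign transactions.

    in_doubt:    transaction ids recovered after a crash (hypothetical).
    try_resolve: callable(txid) -> bool, True once the foreign server
                 has committed or rolled back the prepared transaction.
    max_rounds:  number of passes to attempt in this sketch.

    Returns the set of transactions that are still unresolved.
    """
    pending = set(in_doubt)
    for _ in range(max_rounds):
        if not pending:
            break
        # Keep only the transactions whose resolution failed this pass,
        # for example because the foreign server is still unreachable.
        pending = {tx for tx in pending if not try_resolve(tx)}
    return pending
```

Transactions that cannot be resolved in one pass simply stay pending
and are picked up again on the next pass.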

I'll register this patch for the next CF. Feedback is very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment                                                           Content-Type              Size
0001-Keep-track-of-writing-on-non-temporary-relation_v16.patch       application/octet-stream  1.9 KB
0002-Support-atomic-commit-among-multiple-foreign-servers_v16.patch  application/octet-stream  198.9 KB
0003-postgres_fdw-supports-atomic-commit-APIs_v16.patch              application/octet-stream  49.8 KB
0004-Add-regression-tests-for-atomic-commit_v16.patch                application/octet-stream  8.1 KB
