Re: Transactions involving multiple postgres foreign servers, take 2

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "m(dot)usama(at)gmail(dot)com" <m(dot)usama(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sulamul(at)gmail(dot)com" <sulamul(at)gmail(dot)com>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "ildar(at)adjust(dot)com" <ildar(at)adjust(dot)com>, "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "chris(dot)travers(at)adjust(dot)com" <chris(dot)travers(at)adjust(dot)com>, "ishii(at)sraoss(dot)co(dot)jp" <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date: 2021-06-14 16:08:51
Message-ID: CA+TgmoZWYaWxhMG3ZYacoG=FOED3cPw2iaAQ2DoxFj8+YivyTA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jun 13, 2021 at 10:04 PM tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
> I know sending a commit request may get an error from various underlying functions, but we're talking about the client side, not Postgres's server side, which could unexpectedly ereport(ERROR) somewhere. So, the new FDW commit routine won't lose control and can return an error code as its return value. For instance, the FDW commit routine for DBMS-X would typically be:
>
> int
> DBMSXCommit(...)
> {
>     int         ret;
>
>     /* extract info from the argument to pass to xa_commit() */
>
>     /*
>      * This is the actual commit function, which is exposed to the app
>      * server (e.g. Tuxedo) through the xa_commit() interface.
>      */
>     ret = DBMSX_xa_commit(...);
>
>     /*
>      * Map xa_commit() return values to the corresponding return values
>      * of the FDW commit routine.
>      */
>     switch (ret)
>     {
>         case XA_RMERR:
>             ret = ...;
>             break;
>         ...
>     }
>
>     return ret;
> }

Well, we're talking about running this commit routine from within
CommitTransaction(), right? So I think it is in fact running in the
server. And if that's so, then you have to worry about how to make it
respond to interrupts. You can't just call some function
DBMSX_xa_commit() and wait indefinitely for it to return. Look at
pgfdw_get_result() for an example of what real code that does this
looks like.
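
To make that concrete, here is a minimal sketch of the interrupt-aware
wait pattern that pgfdw_get_result() uses: instead of blocking inside
the driver, poll the connection's socket via the latch infrastructure
so that CHECK_FOR_INTERRUPTS() gets a chance to run. This is only an
illustration, not the actual postgres_fdw code; the function name
sketch_get_result() and the error message are made up.

    #include "postgres.h"

    #include "libpq-fe.h"
    #include "miscadmin.h"
    #include "pgstat.h"
    #include "storage/latch.h"

    /*
     * Wait for a result from the foreign server without blocking inside
     * libpq, so that query cancel and other interrupts are serviced.
     */
    static PGresult *
    sketch_get_result(PGconn *conn)
    {
        PGresult   *last_res = NULL;

        for (;;)
        {
            PGresult   *res;

            while (PQisBusy(conn))
            {
                int         wc;

                /* Sleep until the socket is readable or our latch is set. */
                wc = WaitLatchOrSocket(MyLatch,
                                       WL_LATCH_SET | WL_SOCKET_READABLE |
                                       WL_EXIT_ON_PM_DEATH,
                                       PQsocket(conn),
                                       -1L,
                                       PG_WAIT_EXTENSION);

                if (wc & WL_LATCH_SET)
                {
                    ResetLatch(MyLatch);
                    /* A blocking DBMSX_xa_commit() never reaches this. */
                    CHECK_FOR_INTERRUPTS();
                }

                if ((wc & WL_SOCKET_READABLE) && !PQconsumeInput(conn))
                    elog(ERROR, "lost connection to foreign server");
            }

            res = PQgetResult(conn);
            if (res == NULL)
                break;          /* command complete; last_res is the result */

            PQclear(last_res);
            last_res = res;
        }

        return last_res;
    }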

> So, we need to design how commit behaves from the user's perspective. That's the functional design. We should figure out what the desirable response of commit is first, and then see if we can implement it or have to compromise in some way. I think we can reference the X/Open TX standard and/or JTS (Java Transaction Service) specification (I haven't had a chance to read them yet, though.) Just in case we can't find the requested commit behavior in the volcano case from those specifications, ... (I'm hesitant to say this because it may be hard,) it's desirable to follow representative products such as Tuxedo and GlassFish (the reference implementation of Java EE specs.)

Honestly, I am not quite sure what any specification has to say about
this. We're talking about what happens when a user does something with
a foreign table and then types COMMIT. That's all about providing a set
of behaviors that are consistent with how PostgreSQL works in other
situations. You can't negotiate away the requirement to handle errors
in a way that works with PostgreSQL's infrastructure, or the
requirement that any lengthy operation handle interrupts properly, by
appealing to a specification.
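
To illustrate that first requirement: a commit routine running in the
backend can't just hand an xa_commit()-style return code back to the
caller; failures have to go through ereport(ERROR) so that transaction
abort and resource cleanup work the way they do everywhere else in
PostgreSQL. A hedged sketch, reusing the hypothetical DBMSX names from
the quoted example (none of the DBMSX* identifiers are real APIs, and
this is not the callback signature from the patch):

    #include "postgres.h"

    #include "miscadmin.h"

    /* Hypothetical stand-ins for whatever the DBMS-X driver provides. */
    typedef struct DBMSXState
    {
        const char *servername;
        int         xid;
        int         flags;
    } DBMSXState;

    extern int  DBMSX_xa_commit(int xid, int flags);
    #define XA_OK 0                 /* success code in the XA spec */

    /*
     * Commit callback sketch: report failure through PostgreSQL's error
     * machinery instead of returning a driver-specific error code.
     */
    static void
    dbmsx_commit_foreign_transaction(DBMSXState *state)
    {
        int         ret;

        CHECK_FOR_INTERRUPTS();

        ret = DBMSX_xa_commit(state->xid, state->flags);
        if (ret != XA_OK)
            ereport(ERROR,
                    (errcode(ERRCODE_CONNECTION_FAILURE),
                     errmsg("could not commit transaction on foreign server \"%s\"",
                            state->servername),
                     errdetail("Driver returned XA error code %d.", ret)));
    }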

> Concurrent transactions are serialized at the resolver. I heard that the current patch handles 2PC like this: the TM (transaction manager in Postgres core) requests prepare to the resolver, the resolver sends prepare to the remote server and waits for the reply, the TM gets back control from the resolver, the TM requests commit to the resolver, the resolver sends commit to the remote server and waits for the reply, and the TM gets back control. The resolver handles one transaction at a time.

That sounds more like a limitation of the present implementation than
a fundamental problem. We shouldn't reject the idea of having a
resolver process handle this just because the initial implementation
might be slow. If there's no fundamental problem with the idea,
parallelism and concurrency can be improved in separate patches at a
later time. It's much more important at this stage to reject ideas
that are not theoretically sound.
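
For concreteness, the per-server handshake described above amounts to
driving ordinary two-phase-commit statements over each foreign
connection, which is also why the prepares and commits for different
servers could in principle be issued concurrently later. A minimal
libpq sketch under those assumptions (the table, GID, and connection
string are illustrative, not the patch's actual code; the remote
server needs max_prepared_transactions > 0):

    #include <stdio.h>
    #include <stdlib.h>

    #include "libpq-fe.h"

    /* Run one statement and bail out if it doesn't succeed. */
    static void
    exec_or_die(PGconn *conn, const char *sql)
    {
        PGresult   *res = PQexec(conn, sql);

        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            fprintf(stderr, "%s failed: %s", sql, PQerrorMessage(conn));
            PQclear(res);
            exit(1);
        }
        PQclear(res);
    }

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("dbname=postgres");  /* illustrative conninfo */
        const char *gid = "fx_example_gid";                 /* illustrative GID */
        char        sql[256];

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        /* The remote work, done inside an explicit transaction block. */
        exec_or_die(conn, "BEGIN");
        exec_or_die(conn, "CREATE TABLE IF NOT EXISTS fx_demo (v int)");
        exec_or_die(conn, "INSERT INTO fx_demo VALUES (1)");

        /* Phase 1: the transaction becomes durable but undecided. */
        snprintf(sql, sizeof(sql), "PREPARE TRANSACTION '%s'", gid);
        exec_or_die(conn, sql);

        /* Phase 2: in the patch, this is what the resolver issues later. */
        snprintf(sql, sizeof(sql), "COMMIT PREPARED '%s'", gid);
        exec_or_die(conn, sql);

        PQfinish(conn);
        return 0;
    }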

--
Robert Haas
EDB: http://www.enterprisedb.com
