RE: Transactions involving multiple postgres foreign servers, take 2

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Masahiko Sawada' <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Muhammad Usama <m(dot)usama(at)gmail(dot)com>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, amul sul <sulamul(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ildar Musin <ildar(at)adjust(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Chris Travers <chris(dot)travers(at)adjust(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Subject: RE: Transactions involving multiple postgres foreign servers, take 2
Date: 2020-10-02 09:20:32
Message-ID: TYAPR01MB2990CDE7F841028A44A85C95FE310@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
> You proposed the first idea
> to avoid such a situation that FDW implementor can write the code
> while trying to reduce the possibility of errors happening as much as
> possible, for example by using palloc_extended(MCXT_ALLOC_NO_OOM) and
> hash_search(HASH_ENTER_NULL) but I think it's not a comprehensive
> solution. They might miss it, not know it, or use other functions
> provided by the core that could lead to an error.

We can give the guideline in the manual, can't we? It should not be especially difficult for the FDW implementor compared with Postgres's other extensibility features, which have their own rules -- table/index AMs, user-defined C functions, trigger functions in C, user-defined data types, hooks, etc. Also, the Postgres functions that the FDW implementor would use to implement their commit will be very limited, won't they? That is because most of the commit processing is performed in the resource manager's library (e.g. the Oracle or MySQL client library).

(For that matter, the manual does not currently tell developers of server-side modules which functions (like palloc) are available, does it?)

> Another idea is to use
> PG_TRY() and PG_CATCH(). IIUC with this idea, FDW implementor catches
> an error but ignores it rather than rethrowing by PG_RE_THROW() in
> order to return the control to the core after an error. I’m really not
> sure it’s a correct usage of those macros. In addition, after
> returning to the core, it will retry to resolve the same or other
> foreign transactions. That is, after ignoring an error, the core needs
> to continue working and possibly call transaction callbacks of other
> FDW implementations.

No, the error is not ignored. The FDW can emit a WARNING, LOG, or NOTICE message and return an error code to the TM. The TM can also emit a message like:

WARNING: failed to commit part of a transaction on the foreign server 'XXX'
HINT: The server continues to try committing the remote transaction.

Then the TM asks the resolver to take care of committing the remote transaction, and acknowledges the commit success to the client. The relevant return codes of xa_commit() are:

--------------------------------------------------
[XAER_RMERR]
An error occurred in committing the work performed on behalf of the transaction
branch and the branch’s work has been rolled back. Note that returning this error
signals a catastrophic event to a transaction manager since other resource
managers may successfully commit their work on behalf of this branch. This error
should be returned only when a resource manager concludes that it can never
commit the branch and that it cannot hold the branch’s resources in a prepared
state. Otherwise, [XA_RETRY] should be returned.

[XAER_RMFAIL]
An error occurred that makes the resource manager unavailable.
--------------------------------------------------
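
To make the idea concrete, here is a minimal standalone sketch in C of how the TM might act on those return codes. It is only an illustration under assumptions, not proposed patch code: the numeric values follow xa.h from the XA specification, while TmAction, tm_classify_commit_result() and tm_handle_commit_result() are hypothetical names that do not exist in Postgres. Treating [XAER_RMFAIL] as retryable by the resolver is also my assumption.

```c
#include <stdbool.h>
#include <stdio.h>

/* xa_commit() return codes, with the values used in the XA spec's xa.h. */
#define XA_OK         0
#define XA_RETRY      4
#define XAER_RMERR   (-3)
#define XAER_RMFAIL  (-7)

/* Hypothetical actions the transaction manager (TM) could take. */
typedef enum
{
    TM_COMMIT_DONE,         /* remote commit succeeded */
    TM_HAND_TO_RESOLVER,    /* branch stays prepared; resolver retries later */
    TM_REPORT_ROLLED_BACK,  /* branch was rolled back: catastrophic */
    TM_UNEXPECTED           /* code this sketch does not handle */
} TmAction;

/* Map the code returned by the FDW's commit routine to a TM action. */
TmAction
tm_classify_commit_result(int rc)
{
    switch (rc)
    {
        case XA_OK:
            return TM_COMMIT_DONE;
        case XA_RETRY:
        case XAER_RMFAIL:   /* assumption: RM unavailable, so retry later */
            return TM_HAND_TO_RESOLVER;
        case XAER_RMERR:
            return TM_REPORT_ROLLED_BACK;
        default:
            return TM_UNEXPECTED;
    }
}

/*
 * Returns true if the TM can acknowledge commit success to the client,
 * either because the remote commit is done or because the resolver
 * will finish it in the background.
 */
bool
tm_handle_commit_result(const char *server, int rc)
{
    switch (tm_classify_commit_result(rc))
    {
        case TM_COMMIT_DONE:
            return true;
        case TM_HAND_TO_RESOLVER:
            fprintf(stderr,
                    "WARNING: failed to commit part of a transaction "
                    "on the foreign server '%s'\n"
                    "HINT: The server continues to try committing "
                    "the remote transaction.\n", server);
            /* A real TM would queue the branch for the resolver here. */
            return true;
        default:
            /* Rolled back or unknown: the caller must report a failure. */
            return false;
    }
}
```

The point of the sketch is that the FDW reports the outcome through a return value rather than by throwing an error, so control always comes back to the TM, which then decides whether to continue, retry via the resolver, or report the failure.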

Regards
Takayuki Tsunakawa
