Re: Transactions involving multiple postgres foreign servers, take 2

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Muhammad Usama <m(dot)usama(at)gmail(dot)com>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, amul sul <sulamul(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ildar Musin <ildar(at)adjust(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Chris Travers <chris(dot)travers(at)adjust(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date: 2020-10-06 13:52:11
Message-ID: CA+fd4k6_0dUV189KJ_mvmS6z6ejRX=zMKBDt4rjLH+UwpOryCQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, 2 Oct 2020 at 18:20, tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
>
> From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
> > You proposed the first idea
> > to avoid such a situation that FDW implementor can write the code
> > while trying to reduce the possibility of errors happening as much as
> > possible, for example by using palloc_extended(MCXT_ALLOC_NO_OOM) and
> > hash_search(HASH_ENTER_NULL) but I think it's not a comprehensive
> > solution. They might miss, not know it, or use other functions
> > provided by the core that could lead to an error.
>
> We can give the guideline in the manual, can't we? It should not be especially difficult for the FDW implementor compared to other Postgres's extensibility features that have their own rules -- table/index AM, user-defined C function, trigger function in C, user-defined data types, hooks, etc. And, the Postgres functions that the FDW implementor would use to implement their commit will be very limited, won't they? Because most of the commit processing is performed in the resource manager's library (e.g. Oracle and MySQL client library.)

Yeah, if we can assume that FDW implementors will implement these APIs
properly by following the guideline, providing the guideline is a good
idea. But I’m not sure all FDW implementors will be able to do that,
and if a user uses an FDW whose transaction APIs don’t follow the
guideline, the user won’t realize it. IMO it’s better to design the
feature so that its reliability (correctness?) does not depend on
external programs, although I might be worrying too much.

>
>
> > Another idea is to use
> > PG_TRY() and PG_CATCH(). IIUC with this idea, FDW implementor catches
> > an error but ignores it rather than rethrowing by PG_RE_THROW() in
> > order to return the control to the core after an error. I’m really not
> > sure it’s a correct usage of those macros. In addition, after
> > returning to the core, it will retry to resolve the same or other
> > foreign transactions. That is, after ignoring an error, the core needs
> > to continue working and possibly call transaction callbacks of other
> > FDW implementations.
>
> No, not ignore the error. The FDW can emit a WARNING, LOG, or NOTICE message, and return an error code to TM. TM can also emit a message like:
>
> WARNING: failed to commit part of a transaction on the foreign server 'XXX'
> HINT: The server continues to try committing the remote transaction.
>
> Then TM asks the resolver to take care of committing the remote transaction, and acknowledge the commit success to the client.

It seems that if the backend fails to resolve a foreign transaction, it
would still return an acknowledgment of COMMIT to the client, and the
resolver process would resolve the foreign prepared transactions in the
background. So we can ensure that the distributed transaction is
completed at the time the client gets the acknowledgment of COMMIT only
if the 2nd phase of 2PC completes successfully on the first attempt.
OTOH, if it fails for whatever reason, there is no such guarantee. From
an optimistic perspective, i.e., assuming failures are unlikely to
happen, it will work well, but IMO it’s not uncommon to fail to resolve
foreign transactions due to network issues, especially in an unreliable
network environment such as a geo-distributed database. So I think it
will end up requiring the client to check whether the preceding
distributed transactions have completed in order to see the results of
these transactions.
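
To make the concern concrete, here is a minimal standalone C sketch (not
PostgreSQL code) of the flow as I understand it. All names (Participant,
commit_prepared, commit_and_ack, all_resolved) are hypothetical, and a
network failure is simulated with a flag. It only shows that the
acknowledgment can be returned while some participants are still
unresolved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one foreign server taking part in 2PC. */
typedef struct Participant
{
    const char *server_name;
    bool        reachable;   /* simulated network state */
    bool        committed;   /* true once the 2nd phase succeeded */
} Participant;

/* Simulated 2nd phase on one server; returns false on failure. */
static bool
commit_prepared(Participant *p)
{
    if (!p->reachable)
        return false;
    p->committed = true;
    return true;
}

/*
 * Try the 2nd phase once per participant. A failure does not stop the
 * loop; that participant is counted as left to the resolver. The caller
 * acknowledges COMMIT regardless of the returned count.
 */
static int
commit_and_ack(Participant *parts, size_t n)
{
    int pending = 0;

    assert(parts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (!commit_prepared(&parts[i]))
            pending++;
    }
    return pending;
}

/* The check a client would need before trusting remote results. */
static bool
all_resolved(const Participant *parts, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (!parts[i].committed)
            return false;
    }
    return true;
}
```

If commit_and_ack() returns a nonzero count, the client has already
received its acknowledgment but all_resolved() is still false, which is
the case where the client would have to check by itself.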

We could retry the foreign transaction resolution before leaving it to
the resolver process, but the problem still remains that the core
continues trying to resolve foreign transactions after an error, with
neither aborting the transaction nor rethrowing the error.
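
As a rough illustration of that pattern, below is a standalone C sketch
that uses plain setjmp()/longjmp() as a stand-in for PG_TRY()/PG_CATCH().
It is not PostgreSQL code: resolve_foreign_xact, resolve_with_retry and
failures_left are hypothetical names, and the failure is simulated. The
point is only that the error is caught, not rethrown, and the caller
keeps going.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf error_ctx;      /* stand-in for an exception stack */
static int     failures_left;  /* simulated: attempts that will fail */

/* Simulated FDW callback that "throws" by jumping out on failure. */
static void
resolve_foreign_xact(void)
{
    if (failures_left > 0)
    {
        failures_left--;
        longjmp(error_ctx, 1);
    }
}

/*
 * Retry up to max_attempts times. Returns true if resolution succeeded,
 * false if the work would have to be handed over to the resolver.
 */
static bool
resolve_with_retry(int max_attempts)
{
    volatile int attempt = 0;  /* volatile: read again after a longjmp */

    assert(max_attempts >= 0);
    while (attempt < max_attempts)
    {
        attempt++;
        if (setjmp(error_ctx) == 0)
        {
            resolve_foreign_xact();
            return true;
        }
        /* Error caught here and not rethrown; the loop just continues. */
    }
    return false;
}
```

Even with the retry, the code after the catch still runs in a state
where an error has happened but nothing was aborted, which is the part
I’m not comfortable with.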

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
