RE: Transactions involving multiple postgres foreign servers, take 2

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Masahiko Sawada' <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "masao(dot)fujii(at)oss(dot)nttdata(dot)com" <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "m(dot)usama(at)gmail(dot)com" <m(dot)usama(at)gmail(dot)com>, "ikedamsh(at)oss(dot)nttdata(dot)com" <ikedamsh(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sulamul(at)gmail(dot)com" <sulamul(at)gmail(dot)com>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "ildar(at)adjust(dot)com" <ildar(at)adjust(dot)com>, "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "chris(dot)travers(at)adjust(dot)com" <chris(dot)travers(at)adjust(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "ishii(at)sraoss(dot)co(dot)jp" <ishii(at)sraoss(dot)co(dot)jp>
Subject: RE: Transactions involving multiple postgres foreign servers, take 2
Date: 2020-10-21 09:33:31
Message-ID: TYAPR01MB2990D3E9BEA99BC9B27491C4FE1C0@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
> So what's your opinion?

My opinion is simple and has not changed. Let's clarify and refine the design first in the following areas (others may have pointed out something else too, but I don't remember), before going deeper into the code review.

* FDW interface
New functions that other FDWs can actually implement. Currently, XA seems to be the only model we can rely on to validate the FDW interface.
What FDW function would call what XA function(s)? What should be the arguments for the FDW functions?

* Performance
Parallel prepare and commits on the client backend. The current implementation is intolerable and should not be considered first-release quality. I proposed the idea.
(If you insist you don't want to do anything about this, I have to think you're just rushing for the patch commit. I want to keep Postgres's reputation.)
As part of this, I'd like to see the 2PC's message flow and disk writes (via email and/or on the following wiki.) That helps evaluate the 2PC performance, because it's hard to figure that out from the code of a large patch set. I'm simply imagining what is typically written in database textbooks and research papers. I'm asking this because I saw some discussion in this thread that some new WAL records are added. I'm worried that transactions have to write WAL records other than prepare and commit, unlike textbook implementations.

Atomic Commit of Distributed Transactions
https://wiki.postgresql.org/wiki/Atomic_Commit_of_Distributed_Transactions
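
To make the request concrete, here is a minimal sketch of the flow as I understand the usual textbook description of basic two-phase commit, not a description of what the patch does. The Participant class, the two_phase_commit function, and the in-memory "logs" are all hypothetical names made up for illustration. It fans out the prepare and the decision to all participants concurrently and then counts messages and forced log writes, which is the kind of accounting I'd like to see for the patch.

```python
from concurrent.futures import ThreadPoolExecutor


class Participant:
    """Hypothetical stand-in for a foreign server with an in-memory log."""

    def __init__(self, name, vote_yes=True):
        self.name = name
        self.vote_yes = vote_yes
        self.log = []  # each entry models one forced log write

    def prepare(self):
        # A participant forces a record to its log before replying:
        # "prepare" for a yes vote, "abort" for a no vote.
        self.log.append("prepare" if self.vote_yes else "abort")
        return self.vote_yes

    def decide(self, commit):
        # Force the coordinator's decision, unless it is already logged
        # (a participant that voted no has already logged "abort").
        record = "commit" if commit else "abort"
        if self.log[-1] != record:
            self.log.append(record)
        return True  # acknowledgement


def two_phase_commit(participants):
    """Run textbook-style 2PC; assumes at least one participant."""
    n = len(participants)
    coordinator_log = []
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Phase 1: send PREPARE to every participant in parallel.
        votes = list(pool.map(Participant.prepare, participants))
        commit = all(votes)
        # The coordinator forces its decision before telling anyone.
        coordinator_log.append(("commit" if commit else "abort", "forced"))
        # Phase 2: send the decision to every participant in parallel.
        list(pool.map(lambda p: p.decide(commit), participants))
    # After all acks, an end record can be written without forcing.
    coordinator_log.append(("end", "lazy"))
    return {
        "committed": commit,
        "messages": 2 * n + 2 * n,  # request + reply, in each phase
        "forced_writes": sum(len(p.log) for p in participants) + 1,
        "lazy_writes": 1,
        "coordinator_log": coordinator_log,
    }


if __name__ == "__main__":
    servers = [Participant(f"fs{i}") for i in range(3)]
    print(two_phase_commit(servers))
```

With N participants that all vote yes, this model gives 4N messages, 2N + 1 forced writes, and one non-forced write. Any WAL record the patch writes beyond that would show up as a difference from these counts.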

* Query cancellation
As you showed, there's no problem with postgres_fdw?
The cancelability of FDW in general remains a problem, but that can be a separate undertaking.

* Global visibility
This is what Amit-san suggested several times -- "design it before reviewing the current patch." I'm a bit optimistic about this and think this FDW 2PC can be implemented separately as a pure enhancement of FDW. But I also understand his concern. If your (our?) aim is to use this FDW 2PC for sharding, we may have to design the combination of 2PC and visibility first.

> I don’t think we need to stipulate the query cancellation. Anyway I
> guess the facts neither that we don’t stipulate anything about query
> cancellation now nor that postgres_fdw might not be cancellable in
> some situations now are not a reason for not supporting query
> cancellation. If it's a desirable behavior and users want it, we need
> to put an effort to support it as much as possible like we’ve done in
> postgres_fdw. Some FDWs unfortunately might not be able to support it
> only by their functionality but it would be good if we can achieve
> that by combination of PostgreSQL and FDW plugins.

Let me comment on this a bit; this is a somewhat dangerous idea, I'm afraid. We need to pay attention to the FDW interface and its documentation so that FDW developers can implement what we consider important -- query cancellation in your discussion. "postgres_fdw is OK, so the interface is good" can create interfaces that other FDW developers can't use. That's what Tomas Vondra pointed out several years ago.

Regards
Takayuki Tsunakawa
