Re: Transactions involving multiple postgres foreign servers, take 2

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
Cc: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "m(dot)usama(at)gmail(dot)com" <m(dot)usama(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sulamul(at)gmail(dot)com" <sulamul(at)gmail(dot)com>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "ildar(at)adjust(dot)com" <ildar(at)adjust(dot)com>, "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "chris(dot)travers(at)adjust(dot)com" <chris(dot)travers(at)adjust(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "ishii(at)sraoss(dot)co(dot)jp" <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date: 2021-06-30 01:05:45
Message-ID: CAD21AoBaTc8M7D1iTvBxrfjQw8B3AgFTnjfWcXPUhgu4T6K8jw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 25, 2021 at 9:53 AM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>
> Hi Jamison-san, sawada-san,
>
> Thanks for testing!
>
> FWIW, I tested using pgbench with the "--rate=" option to check whether the
> server can execute transactions with stable throughput. As sawada-san said,
> the latest patch resolves the second phase of 2PC asynchronously, so
> it's difficult to keep the throughput stable without the "--rate=" option.
>
> I was also unsure what to do when the error happens, because increasing
> "max_prepared_foreign_transaction" doesn't help. Since overloading the server
> can trigger the error, would it be better to mention that case in the HINT message?
>
> BTW, if sawada-san has already developed a version that runs the resolver
> processes in parallel, why not measure the performance improvement? Although
> Robert-san, Tsunakawa-san and others are discussing which architecture is
> best, one discussion point is that adopting the asynchronous approach carries
> a performance risk. If we have promising solutions, I think we can move the
> discussion forward.

Yeah, if we can asynchronously resolve the distributed transactions
without worrying about max_prepared_foreign_transaction error, it
would be good. But we will need synchronous resolution at some point.
I think we at least need to discuss it at this point.
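
To illustrate why raising the limit alone does not help under asynchronous resolution, here is a toy model, not PostgreSQL code. It assumes a fixed number of slots (standing in for max_prepared_foreign_transaction), backends that prepare a fixed number of foreign transactions per tick without waiting, and a resolver that frees a fixed number of slots per tick. The function name and parameters are made up for this sketch.

```python
# Toy model only: max_slots stands in for max_prepared_foreign_transaction.
# None of these names exist in PostgreSQL or in the patch.

def simulate(prepare_per_tick, resolve_per_tick, max_slots, ticks):
    """Return the tick at which a prepare finds no free slot, or None."""
    in_use = 0
    for tick in range(ticks):
        # Backends prepare foreign transactions and return without waiting
        # for the second phase to finish.
        for _ in range(prepare_per_tick):
            if in_use >= max_slots:
                return tick  # the "out of slots" error would be raised here
            in_use += 1
        # The resolver frees slots asynchronously, at its own pace.
        in_use -= min(in_use, resolve_per_tick)
    return None


if __name__ == "__main__":
    print(simulate(3, 2, 10, 100))    # load exceeds resolver capacity
    print(simulate(3, 2, 100, 1000))  # larger limit, same imbalance
    print(simulate(2, 2, 10, 100))    # load throttled to resolver capacity
```

In this model, whenever the prepare rate exceeds the resolve rate, a larger slot limit only postpones the failure; capping the incoming rate (which is what pgbench's "--rate=" does on the client side) or making backends wait for resolution keeps the slot usage bounded.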

I've attached the new version patch that incorporates the comments
from Fujii-san and Ikeda-san I got so far. We launch a resolver
process per foreign server, committing prepared foreign transactions
on foreign servers in parallel. To get better performance with
the current architecture, we could have multiple resolver processes per
foreign server, but that seems hard to tune in practice. Perhaps
it would be better to simply have a pool of resolver processes and
assign one resolver process to each distributed transaction to
resolve? That way, we would need to launch as many resolver processes
as there are concurrent backends using 2PC.
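
A rough sketch of the pool idea, using Python threads in place of resolver processes. This is only an illustration under my own assumptions: `resolve_transaction`, `run_pool` and the `commit_prepared` callback are hypothetical names, and the real resolvers would be background workers issuing COMMIT PREPARED on each foreign server.

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_transaction(txn_id, servers, commit_prepared):
    """One resolver handles every participant of one distributed transaction."""
    for server in servers:
        # Hypothetical callback standing in for COMMIT PREPARED on a server.
        commit_prepared(server, txn_id)
    return txn_id


def run_pool(transactions, pool_size, commit_prepared):
    """Resolve transactions with a fixed-size pool of resolvers.

    transactions: dict mapping a transaction id to its foreign server names.
    Returns the resolved transaction ids in submission order.
    """
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [
            pool.submit(resolve_transaction, txn_id, servers, commit_prepared)
            for txn_id, servers in transactions.items()
        ]
        return [f.result() for f in futures]
```

With this shape, the only tuning knob is the pool size, and a pool as large as the number of concurrent backends using 2PC means no transaction waits for a free resolver.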

> In my understanding, there are three improvement ideas. The first is to make
> the resolver processes run in parallel. The second is to send "COMMIT/ABORT
> PREPARED" to remote servers in bulk. The third is to stop syncing the WAL in
> remove_fdwxact() after resolving is done, which I addressed in the mail sent
> on June 3rd, 13:56. Since the third idea has not been discussed yet, I may
> have misunderstood something.

Yes, those optimizations are promising. On the other hand, they could
introduce complexity to the code and APIs. I'd like to keep the first
version simple. I think we need to discuss them at this stage but can
leave the implementation of both parallel execution and batch
execution as future improvements.

For the third idea, I think the implementation was wrong; it removes
the state file and then flushes the WAL record. I think these should be
performed in the reverse order. Otherwise, an FdwXactState entry could be
left on the standby if the server crashes between the two steps. I might
be missing something though.
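
The ordering concern can be sketched with a small model. This is a simplification under stated assumptions, not the patch's code: the `Primary` class, its methods and `run` are hypothetical, unflushed WAL is assumed to be lost on crash, and the standby is assumed to drop its entry only after it receives a flushed removal record.

```python
class Primary:
    """Toy primary holding one resolved foreign transaction."""

    def __init__(self):
        self.state_file = True   # on-disk state file still exists
        self.wal_buffer = []     # WAL written but not yet flushed
        self.wal_flushed = []    # durable WAL, the part a standby can receive

    def write_remove_record(self):
        self.wal_buffer.append("remove_fdwxact")

    def flush_wal(self):
        self.wal_flushed.extend(self.wal_buffer)
        self.wal_buffer.clear()

    def remove_state_file(self):
        self.state_file = False


def run(steps, crash_after):
    """Run the first `crash_after` steps, then crash.

    Returns (state_file_exists_on_primary, entry_left_on_standby).
    """
    p = Primary()
    p.write_remove_record()
    for name in steps[:crash_after]:
        getattr(p, name)()
    p.wal_buffer.clear()  # crash: unflushed WAL is lost
    entry_left_on_standby = "remove_fdwxact" not in p.wal_flushed
    return p.state_file, entry_left_on_standby


if __name__ == "__main__":
    # Remove the file first, crash before the flush.
    print(run(["remove_state_file", "flush_wal"], crash_after=1))
    # Flush first, crash before removing the file.
    print(run(["flush_wal", "remove_state_file"], crash_after=1))
```

In the first ordering the primary has already forgotten the transaction while the removal record never became durable, so nothing tells the standby to drop its entry. In the second ordering the state file survives the crash, but the removal record is durable and can be replayed.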

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
v37-0008-Documentation-update.patch application/octet-stream 49.9 KB
v37-0006-postgres_fdw-marks-foreign-transaction-as-modifi.patch application/octet-stream 4.2 KB
v37-0007-Add-GetPrepareId-API.patch application/octet-stream 4.3 KB
v37-0009-Add-regression-tests-for-foreign-twophase-commit.patch application/octet-stream 44.4 KB
v37-0005-Prepare-foreign-transactions-at-commit-time.patch application/octet-stream 18.5 KB
v37-0004-postgres_fdw-supports-prepare-API.patch application/octet-stream 8.8 KB
v37-0002-postgres_fdw-supports-commit-and-rollback-APIs.patch application/octet-stream 19.6 KB
v37-0001-Introduce-transaction-manager-for-foreign-transa.patch application/octet-stream 12.5 KB
v37-0003-Support-two-phase-commit-for-foreign-transaction.patch application/octet-stream 153.3 KB
