Re: Transactions involving multiple postgres foreign servers, take 2

From: "ikedamsh(at)oss(dot)nttdata(dot)com" <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "m(dot)usama(at)gmail(dot)com" <m(dot)usama(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sulamul(at)gmail(dot)com" <sulamul(at)gmail(dot)com>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "ildar(at)adjust(dot)com" <ildar(at)adjust(dot)com>, "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "chris(dot)travers(at)adjust(dot)com" <chris(dot)travers(at)adjust(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "ishii(at)sraoss(dot)co(dot)jp" <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date: 2021-06-04 06:58:47
Message-ID: B97179DE-5573-4DC0-918B-14AAF6F8DF60@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 2021/06/04 12:28、Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>のメール:
>
> On Thu, Jun 3, 2021 at 1:56 PM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com <mailto:ikedamsh(at)oss(dot)nttdata(dot)com>> wrote:
>>
>>
>>
>> On 2021/05/25 21:59, Masahiko Sawada wrote:
>>> On Fri, May 21, 2021 at 5:48 PM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>>>>
>>>> On 2021/05/21 13:45, Masahiko Sawada wrote:
>>>>>
>>>>> Yes. We also might need to be careful about the order of foreign
>>>>> transaction resolution. I think we need to resolve foreign> transactions in arrival order at least within a foreign server.
>>>>
>>>> I agree it's better.
>>>>
>>>> (Although this is my interest...)
>>>> Is it necessary? Although this idea seems to be for atomic visibility,
>>>> 2PC can't realize that as you know. So, I wondered that.
>>>
>>> I think it's for fairness. If a foreign transaction arrived earlier
>>> gets put off so often for other foreign transactions arrived later due
>>> to its index in FdwXactCtl->xacts, it’s not understandable for users
>>> and not fair. I think it’s better to handle foreign transactions in
>>> FIFO manner (although this problem exists even in the current code).
>>
>> OK, thanks.
>>
>>
>> On 2021/05/21 12:45, Masahiro Ikeda wrote:
>>> On 2021/05/21 10:39, Masahiko Sawada wrote:
>>>> On Thu, May 20, 2021 at 1:26 PM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
>> wrote:
>>>> How much is the performance without those 2PC patches and with the
>>>> same workload? i.e., how fast is the current postgres_fdw that uses
>>>> XactCallback?
>>>
>>> OK, I'll test.
>>
>> The test results are followings. But, I couldn't confirm the performance
>> improvements of 2PC patches though I may need to be changed the test condition.
>>
>> [condition]
>> * 1 coordinator and 3 foreign servers
>> * There are two custom scripts which access different two foreign servers per
>> transaction
>>
>> ``` fxact_select.pgbench
>> BEGIN;
>> SELECT * FROM part:p1 WHERE id = :id;
>> SELECT * FROM part:p2 WHERE id = :id;
>> COMMIT;
>> ```
>>
>> ``` fxact_update.pgbench
>> BEGIN;
>> UPDATE part:p1 SET md5 = md5(clock_timestamp()::text) WHERE id = :id;
>> UPDATE part:p2 SET md5 = md5(clock_timestamp()::text) WHERE id = :id;
>> COMMIT;
>> ```
>>
>> [results]
>>
>> I have tested three times.
>> Performance difference seems to be within the range of errors.
>>
>> # 6d0eb38557 with 2pc patches(v36) and foreign_twophase_commit = disable
>> - fxact_update.pgbench
>> 72.3, 74.9, 77.5 TPS => avg 74.9 TPS
>> 110.5, 106.8, 103.2 ms => avg 106.8 ms
>>
>> - fxact_select.pgbench
>> 1767.6, 1737.1, 1717.4 TPS => avg 1740.7 TPS
>> 4.5, 4.6, 4.7 ms => avg 4.6ms
>>
>> # 6d0eb38557 without 2pc patches
>> - fxact_update.pgbench
>> 76.5, 70.6, 69.5 TPS => avg 72.2 TPS
>> 104.534 + 113.244 + 115.097 => avg 111.0 ms
>>
>> -fxact_select.pgbench
>> 1810.2, 1748.3, 1737.2 TPS => avg 1765.2 TPS
>> 4.2, 4.6, 4.6 ms=> 4.5 ms
>>
>
> Thank you for testing!
>
> I think the result shows that managing foreign transactions on the
> core side would not be a problem in terms of performance.
>
>>
>>
>>
>>
>> # About the bottleneck of the resolver process
>>
>> I investigated the performance bottleneck of the resolver process using perf.
>> The main bottleneck is the following functions.
>>
>> 1st. 42.8% routine->CommitForeignTransaction()
>> 2nd. 31.5% remove_fdwxact()
>> 3rd. 10.16% CommitTransaction()
>>
>> 1st and 3rd problems can be solved by parallelizing resolver processes per
>> remote servers. But, I wondered that the idea, which backends call also
>> "COMMIT/ABORT PREPARED" and the resolver process only takes changes of
>> resolving in-doubt foreign transactions, is better. In many cases, I think
>> that the number of connections is much greater than the number of remote
>> servers. If so, the parallelization is not enough.
>>
>> So, I think the idea which backends execute "PREPARED COMMIT" synchronously is
>> better. The citus has the 2PC feature and backends send "PREPARED COMMIT" in
>> the extension. So, this idea is not bad.
>
> Thank you for pointing it out. This idea has been proposed several
> times and there were discussions. I'd like to summarize the proposed
> ideas and those pros and cons before replying to your other comments.
>
> There are 3 ideas. After backend both prepares all foreign transaction
> and commit the local transaction,
>
> 1. the backend continues attempting to commit all prepared foreign
> transactions until all of them are committed.
> 2. the backend attempts to commit all prepared foreign transactions
> once. If an error happens, leave them for the resolver.
> 3. the backend asks the resolver that launched per foreign server to
> commit the prepared foreign transactions (and backend waits or doesn't
> wait for the commit completion depending on the setting).
>
> With ideas 1 and 2, since the backend itself commits all foreign
> transactions the resolver process cannot be a bottleneck, and probably
> the code can get more simple as backends don't need to communicate
> with resolver processes.
>
> However, those have two problems we need to deal with:

Thanks for sharing the summarize. I understood there are problems related to
FDW implementation.

> First, users could get an error if an error happens during the backend
> committing prepared foreign transaction but the local transaction is
> already committed and some foreign transactions could also be
> committed, confusing users. There were two opinions to this problem:
> FDW developers should be responsible for writing FDW code such that
> any error doesn't happen during committing foreign transactions, and
> users can accept that confusion since an error could happen after
> writing the commit WAL even today without this 2PC feature. For the
> former point, I'm not sure it's always doable since even palloc()
> could raise an error and it seems hard to require all FDW developers
> to understand all possible paths of raising an error. And for the
> latter point, that's true but I think those cases are
> should-not-happen cases (i.g., rare cases) whereas the likelihood of
> an error during committing prepared transactions is not low (e.g., by
> network connectivity problem). I think we need to assume that that is
> not a rare case.

Hmm… Sorry, I don’t have any good ideas now.

If anything, I’m on second side which users accept the confusion though
let users know a error happens before local commit is done or not is necessary
because if the former case, users will execute the same query again.

> The second problem is whether we can cancel committing foreign
> transactions by pg_cancel_backend() (or pressing Ctl-c). If the
> backend process commits prepared foreign transactions, it's FDW
> developers' responsibility to write code that is interruptible. I’m
> not sure it’s feasible for drivers for other databases.

Sorry, my understanding is not clear.

After all prepares are done, the foreign transactions will be committed.
So, does this mean that FDW must leave the unresolved transaction to the transaction
resolver and show some messages like “Since the transaction is already committed,
the transaction will be resolved in background" ?

> Idea 3 is proposed to deal with those problems. By having separate
> processes, resolver processes, committing prepared foreign
> transactions, we and FDW developers don't need to worry about those
> two problems.
>
> However as Ikeda-san shared the performance results, idea 3 is likely
> to have a performance problem since resolver processes can easily be
> bottle-neck. Moreover, with the current patch, since we asynchronously
> commit foreign prepared transactions, if many concurrent clients use
> 2PC, reaching max_foreign_prepared_transactions, transactions end up
> with an error.
>
> Through the long discussion on this thread, I've been thought we got a
> consensus on idea 3 but sometimes ideas 1 and 2 are proposed again for
> dealing with the performance problem. Idea 1 and 2 are also good and
> attractive, but I think we need to deal with the two problems first if
> we go with one of those ideas. To be honest, I'm really not sure it's
> good if we make those things FDW developers responsibility.
>
> As long as we commit foreign prepared transactions asynchronously and
> there is max_foreign_prepared_transactions limit, it's possible that
> committing those transactions could not keep up. Maybe the same is
> true for a case where the client heavily uses 2PC and asynchronously
> commits prepared transactions. If committing prepared transactions
> doesn't keep up with preparing transactions, the system reaches
> max_prepared_transactions.
>
> With the current patch, we commit prepared foreign transactions
> asynchronously. But maybe we need to compare the performance of ideas
> 1 (and 2) to idea 3 with synchronous foreign transaction resolution.

OK, I understood the consensus is 3rd one. I agree it since I don’t have any solutions
For the problems related 1st and 2nd. If I find them, I’ll share you.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-06-04 07:21:35 Re: Duplicate history file?
Previous Message Pavel Stehule 2021-06-04 06:58:15 Re: security_definer_search_path GUC