Re: Transactions involving multiple postgres foreign servers, take 2

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "m(dot)usama(at)gmail(dot)com" <m(dot)usama(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sulamul(at)gmail(dot)com" <sulamul(at)gmail(dot)com>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "ildar(at)adjust(dot)com" <ildar(at)adjust(dot)com>, "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "chris(dot)travers(at)adjust(dot)com" <chris(dot)travers(at)adjust(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "ishii(at)sraoss(dot)co(dot)jp" <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Date: 2021-05-21 08:48:08
Message-ID: 5b80c9a3-2ce8-1c2b-65a3-e2b82b95331e@oss.nttdata.com
Lists: pgsql-hackers

On 2021/05/21 13:45, Masahiko Sawada wrote:
> On Fri, May 21, 2021 at 12:45 PM Masahiro Ikeda
> <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>>
>> On 2021/05/21 10:39, Masahiko Sawada wrote:
>>> On Thu, May 20, 2021 at 1:26 PM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>>>>
>>>>
>>>> On 2021/05/11 13:37, Masahiko Sawada wrote:
>>>>> I've attached the updated patches that incorporated comments from
>>>>> Zhihong and Ikeda-san.
>>>>
>>>> Thanks for updating the patches!
>>>>
>>>>
>>>> I have other comments including trivial things.
>>>>
>>>>
>>>> a. about "foreign_transaction_resolver_timeout" parameter
>>>>
>>>> Now, the default value of "foreign_transaction_resolver_timeout" is 60 secs.
>>>> Is there any reason for that? Although the following is a minor case, it may
>>>> confuse some users.
>>>>
>>>> An example case:
>>>>
>>>> 1. A client executes a transaction with 2PC while the resolver is processing
>>>> FdwXactResolverProcessInDoubtXacts().
>>>>
>>>> 2. The resolution of the first transaction must wait until other 2PC
>>>> transactions are executed or the timeout expires.
>>>>
>>>> 3. If the client checks the result of the first transaction, it should wait
>>>> until resolution is finished for atomic visibility (although that depends on
>>>> how atomic visibility is realized). The client may have to wait for
>>>> "foreign_transaction_resolver_timeout". Users may think it has stalled.
>>>>
>>>> A situation like this can be observed after testing with pgbench: some
>>>> unresolved transactions remain after benchmarking.
>>>>
>>>> I assume that this default value follows those of wal_sender, archiver, and
>>>> so on. But I think this parameter is more like "commit_delay". If so, 60
>>>> seconds seems to be a big value.
>>>
>>> IIUC this situation seems like the foreign transaction resolution is a
>>> bottleneck and doesn’t catch up to incoming resolution requests. But
>>> how does foreign_transaction_resolver_timeout relate to this situation?
>>> foreign_transaction_resolver_timeout controls when to terminate the
>>> resolver process that doesn't have any foreign transactions to
>>> resolve. So if we set it to several milliseconds, resolver processes are
>>> terminated immediately after each resolution, imposing the cost of
>>> launching resolver processes on the next resolution.
>>
>> Thanks for your comments!
>>
>> No, this situation is not related to whether the foreign transaction
>> resolution is a bottleneck or not. This issue may happen when the workload
>> has very few foreign transactions.
>>
>> If a new foreign transaction comes while the transaction resolver is processing
>> resolutions via FdwXactResolverProcessInDoubtXacts(), the foreign transaction
>> waits until the next transaction resolution starts. If no further foreign
>> transaction comes, the foreign transaction must wait for its resolution to
>> start until the timeout. This is the situation I mentioned.
>
> Thanks for your explanation. I think that in this case we should set
> the latch of the resolver after preparing all foreign transactions so
> that the resolver processes those transactions without sleeping.

Yes, your idea is much better. Thanks!
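
To make sure I understand the difference, here is a toy model of the two behaviours. It is only a sketch and not the patch code: ToyResolver, prepare() and demo() are hypothetical names, a threading.Event stands in for the resolver's latch, and idle_timeout plays the role of foreign_transaction_resolver_timeout.

```python
import threading
import time
from collections import deque


class ToyResolver:
    """Toy stand-in for a resolver process (hypothetical, not PostgreSQL code)."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout  # role of foreign_transaction_resolver_timeout
        self.latch = threading.Event()    # role of the resolver's latch
        self.lock = threading.Lock()
        self.pending = deque()            # prepared but not yet resolved
        self.resolved = {}                # xid -> seconds waited before resolution
        self.thread = threading.Thread(target=self._main_loop)

    def start(self):
        self.thread.start()

    def prepare(self, xid, set_latch):
        """Called by a 'backend' once its foreign transactions are prepared."""
        with self.lock:
            self.pending.append((xid, time.monotonic()))
        if set_latch:
            self.latch.set()              # wake the resolver right away

    def _main_loop(self):
        while True:
            # Sleep until the latch is set or the idle timeout expires.
            woken = self.latch.wait(self.idle_timeout)
            self.latch.clear()
            with self.lock:
                batch = list(self.pending)
                self.pending.clear()
            if not batch and not woken:
                return                    # idle for the whole timeout: exit
            now = time.monotonic()
            for xid, prepared_at in batch:
                self.resolved[xid] = now - prepared_at


def demo(idle_timeout=1.0):
    resolver = ToyResolver(idle_timeout)
    resolver.start()
    resolver.prepare("xact-with-latch", set_latch=True)
    time.sleep(idle_timeout / 5)          # the resolver is asleep again by now
    resolver.prepare("xact-without-latch", set_latch=False)
    resolver.thread.join()                # the resolver exits after an idle timeout
    return resolver.resolved
```

In this model the transaction that sets the latch is resolved almost immediately, while the one that does not has to wait for most of idle_timeout before the resolver notices it.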

>>
>> Thanks for letting me know the side effect of setting the resolution timeout
>> to several milliseconds. I agree. But why is termination needed? Is there a
>> possibility that it stalls like walsender?
>
> The purpose of this timeout is to terminate resolvers that are idle
> for a long time. The resolver processes don't necessarily need to keep
> running all the time for every database. On the other hand, launching
> a resolver process per commit would be a high cost. So we have
> resolver processes keep running at least for
> foreign_transaction_resolver_timeout.

Understood. I think it's reasonable.

>>>>
>>>>
>>>> b. about performance bottleneck (just sharing my simple benchmark results)
>>>>
>>>> The resolver process can easily become a performance bottleneck, although I
>>>> think some users want this feature even if the performance is not so good.
>>>>
>>>> I tested with a very simple workload on my laptop.
>>>>
>>>> The test conditions are
>>>> * two remote foreign partitions, and one transaction inserts an entry in each
>>>> partition.
>>>> * local connection only. If network latency becomes higher, the performance
>>>> becomes worse.
>>>> * pgbench with 8 clients.
>>>>
>>>> The test results are as follows. The performance with 2PC is only 10% of
>>>> that without 2PC.
>>>>
>>>> * with foreign_twophase_commit = required
>>>> -> If the load exceeds 10 TPS, the number of unresolved foreign transactions
>>>> keeps increasing and the run stops with the warning "Increase
>>>> max_prepared_foreign_transactions".
>>>
>>> What was the value of max_prepared_foreign_transactions?
>>
>> Now, I tested with 200.
>>
>> If each resolution finishes very quickly, I thought that would be enough
>> because 8 clients x 2 partitions = 16, though... But it's difficult to know
>> which values are stable.
>
> While resolving one distributed transaction, the resolver needs both
> one round trip and fsync-ing a WAL record for each foreign transaction.
> Since the client doesn’t wait for the distributed transaction to be
> resolved, the resolver process can easily become a bottleneck given there
> are 8 clients.
>
> If foreign transactions were resolved synchronously, 16 would suffice.

OK, thanks.
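
Just to write the arithmetic down, here is a back-of-the-envelope sketch. The function names are hypothetical, and the offered load of 100 TPS and the resolver rate of 20 foreign transactions per second in the usage note are assumptions for illustration (the latter only loosely suggested by the 10 TPS threshold with 2 partitions), not measured values.

```python
def sync_upper_bound(clients, foreign_xacts_per_txn):
    """Slots needed if every client waited for its own resolution."""
    return clients * foreign_xacts_per_txn


def seconds_until_slots_exhausted(tps, foreign_xacts_per_txn,
                                  resolver_rate, max_slots):
    """Seconds until the unresolved backlog reaches max_slots.

    resolver_rate is in foreign transactions per second. Returns None
    when the resolver keeps up and the backlog does not grow.
    """
    arrival_rate = tps * foreign_xacts_per_txn
    growth = arrival_rate - resolver_rate
    if growth <= 0:
        return None
    return max_slots / growth
```

For example, sync_upper_bound(8, 2) gives 16, while with asynchronous resolution seconds_until_slots_exhausted(100, 2, 20, 200) says 200 slots would fill in a little over a second under those assumed rates. Any max_prepared_foreign_transactions value is eventually exhausted once arrivals outpace the resolver.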

>>
>>
>>> To speed up the foreign transaction resolution, some ideas have been
>>> discussed. As another idea, how about launching resolvers for each
>>> foreign server? That way, we resolve foreign transactions on each
>>> foreign server in parallel. If foreign transactions are concentrated
>>> on the particular server, we can have multiple resolvers for the one
>>> foreign server. It doesn’t change the fact that all foreign
>>> transaction resolutions are processed by resolver processes.
>>
>> Awesome! There seems to be another advantage: even if a foreign server is
>> temporarily busy or stopped due to failover, other foreign servers'
>> transactions can still be resolved.
>
> Yes. We also might need to be careful about the order of foreign
> transaction resolution. I think we need to resolve foreign
> transactions in arrival order at least within a foreign server.

I agree it's better.

(Although this is just out of my own interest...)
Is it necessary? This idea seems to be for atomic visibility, but as you
know, 2PC alone can't realize that. So I wondered about it.
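
For reference, the ordering under discussion (arrival order kept within each foreign server, with servers independent of each other) could be sketched like this. It is only an illustration; queues_by_server and the server and transaction names are hypothetical and do not come from the patch.

```python
from collections import deque


def queues_by_server(arrivals):
    """Split (server, xid) pairs into one FIFO queue per foreign server.

    Arrival order is preserved inside each queue, so a per-server
    resolver that pops from the left resolves in arrival order.
    """
    queues = {}
    for server, xid in arrivals:
        queues.setdefault(server, deque()).append(xid)
    return queues


arrivals = [("srv1", "x1"), ("srv2", "x2"), ("srv1", "x3"), ("srv2", "x4")]
per_server = queues_by_server(arrivals)
```

Here a resolver for srv1 would see x1 before x3 regardless of how quickly srv2's queue is drained.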

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
