Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "Wei Wang (Fujitsu)" <wangw(dot)fnst(at)fujitsu(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-07-15 11:18:14
Message-ID: CAA4eK1+iv9P6RAe4quFp3VXrLjKW9=mLV7p8jfu2GOMNcy+HzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 14, 2023 at 3:07 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, 14 Tem 2023 Cum, 11:11 tarihinde şunu yazdı:
>>
>> Yeah, it is quite surprising that Design#2 is worse than master. I
>> suspect there is something wrong going on with your Design#2 patch.
>> One area to check is whether apply worker is able to quickly assign
>> the new relations to tablesync workers. Note that currently after the
>> first time assigning the tables to workers, the apply worker may wait
>> before processing the next set of tables in the main loop of
>> LogicalRepApplyLoop(). The other minor point about design#2
>> implementation is that you may want to first assign the allocated
>> tablesync workers before trying to launch a new worker.
>
>
> It's not actually worse than master all the time. It seems like it's just unreliable.
> Here are some consecutive runs for both designs and master.
>
> design#1 = 1621,527 ms, 1788,533 ms, 1645,618 ms, 1702,068 ms, 1745,753 ms
> design#2 = 2089,077 ms, 1864,571 ms, 4574,799 ms, 5422,217 ms, 1905,944 ms
> master = 2815,138 ms, 2481,954 ms , 2594,413 ms, 2620,690 ms, 2489,323 ms
>
> And apply worker was not busy with applying anything during these experiments since there were not any writes to the publisher. I'm not sure how that would also affect the performance if there were any writes.
>

Yeah, this is a valid point. I think this is in favor of the Design#1
approach we are discussing here. One thing I was thinking whether we
can do anything to alleviate the contention at the higher worker
count. One possibility is to have some kind of available worker list
which can be used to pick up the next worker instead of checking all
the workers while assigning the next table. We can probably explore it
separately once the first three patches are ready because anyway, this
will be an optimization atop the Design#1 approach.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2023-07-15 13:45:40 Re: logicalrep_message_type throws an error
Previous Message jian he 2023-07-15 09:04:07 Re: SQL:2011 application time