Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-06-27 10:50:56
Message-ID: CAA4eK1LJ9HHsYTP_-K9Jazb+5wiz29APSDk7B08RJi=UPpvUHw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 23, 2023 at 7:03 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
>
> You can find the updated patchset attached.
> I worked to address the reviews and made some additional changes.
>
> Let me first explain the new patchset.
> 0001: Refactors the logical replication code, mostly worker.c and
> tablesync.c. Although this patch makes it easier to reuse workers, I
> believe that it's useful even by itself without other patches. It does
> not improve performance or anything but aims to increase readability
> and such.
> 0002: This is only to reuse worker processes, everything else stays
> the same (replication slots/origins etc.).
> 0003: Adds a new command for streaming replication protocol to create
> a snapshot by an existing replication slot.
> 0004: Reuses replication slots/origins together with workers.
>
> Even only 0001 and 0002 are enough to improve table sync performance
> at the rates previously shared on this thread. This also means that
> currently 0004 (reusing replication slots/origins) does not improve as
> much as I would expect, even though it does not harm either.
> I just wanted to share what I did so far, while I'm continuing to
> investigate it more to see what I'm missing in patch 0004.
>

I think the reason you don't see a benefit from the 0004 patch is that
it still pays the cost of disconnect/connect, and we haven't saved much
on network transfer costs because of the new snapshot you are creating
in patch 0003. Is it possible to avoid the disconnect/connect each time
the patch reuses the same tablesync worker? Once we do that and save
the cost of drop_slot and the associated network round trip, you may
see the benefit of the 0003 and 0004 patches.
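
To make the reasoning above concrete, here is a rough per-table cost
model. It is only a sketch: the class and function names are
hypothetical, the numbers are made-up placeholders rather than
measurements, and it assumes the steady state after a worker's first
table.

```python
from dataclasses import dataclass


@dataclass
class SyncCosts:
    """Hypothetical per-table costs in milliseconds (placeholders, not measured)."""
    connect: float          # open a new connection to the publisher
    disconnect: float       # close that connection
    create_slot: float      # create a replication slot (also yields a snapshot)
    drop_slot: float        # drop the slot, including its network round trip
    create_snapshot: float  # snapshot on an existing slot (the 0003 command)


def per_table_overhead(c: SyncCosts, reuse_slot: bool, reuse_connection: bool) -> float:
    """Overhead paid for each additional table, excluding the copy itself."""
    overhead = 0.0
    if not reuse_connection:
        # The worker reconnects for every table it picks up.
        overhead += c.connect + c.disconnect
    if reuse_slot:
        # The slot is kept, but a fresh snapshot is still needed per table.
        overhead += c.create_snapshot
    else:
        overhead += c.create_slot + c.drop_slot
    return overhead


if __name__ == "__main__":
    costs = SyncCosts(connect=5.0, disconnect=1.0, create_slot=4.0,
                      drop_slot=2.0, create_snapshot=3.0)
    print("worker reuse only:        ", per_table_overhead(costs, False, False))
    print("slot reuse, reconnecting: ", per_table_overhead(costs, True, False))
    print("slot reuse, one connection:", per_table_overhead(costs, True, True))
```

With placeholder numbers like these, reusing the slot while still
reconnecting saves little, because the connect/disconnect and snapshot
costs remain. The larger gain shows up only once the connection is
kept open as well.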

--
With Regards,
Amit Kapila.
