Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2022-07-06 08:17:29
Message-ID: CAFiTN-tff+RVLwkx0DkM77YZniohoq9gJ5AByU-0B=KUCmUyjQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 6, 2022 at 9:06 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> How would you choose the slot name for the table sync? Right now it
> contains the relid of the table for which it needs to perform sync.
> Say, if we fail to include the appropriate identifier in the slot
> name, we won't be able to reuse/drop the slot after a restart of the
> table sync worker due to an error.

I had a quick look at the patch, and it seems to use the worker
array index instead of the relid when forming the slot name. I think
that makes sense, because whichever worker is using that worker
index can now reuse the slot created for that index.
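
To illustrate the difference, here is a minimal sketch, not the patch's
actual code. The function names and the slot name formats are hypothetical,
and so is the round-robin assignment of tables to workers; only the idea
(relid versus worker array index as the identifying part of the name) comes
from the discussion above.

```python
# Hypothetical sketch: the name formats below are illustrative only and are
# not taken from the patch or from the PostgreSQL source.


def slot_name_by_relid(suboid: int, relid: int) -> str:
    """Slot name tied to one table: a different name for every table."""
    return f"pg_{suboid}_sync_rel_{relid}"


def slot_name_by_worker_index(suboid: int, worker_index: int) -> str:
    """Slot name tied to a worker array index: stable across tables."""
    return f"pg_{suboid}_sync_worker_{worker_index}"


def names_used(suboid, relids, num_workers):
    """Collect the distinct slot names each scheme needs.

    Tables are assigned to worker indexes round-robin, which is an
    assumption made only for this example.
    """
    by_relid = {slot_name_by_relid(suboid, relid) for relid in relids}
    by_index = {
        slot_name_by_worker_index(suboid, position % num_workers)
        for position, _ in enumerate(relids)
    }
    return by_relid, by_index


if __name__ == "__main__":
    by_relid, by_index = names_used(16394, [16385, 16388, 16391, 16400], 2)
    print(len(by_relid), "distinct slot names when keyed by relid")
    print(len(by_index), "distinct slot names when keyed by worker index")
```

With the index-based scheme the number of distinct names is bounded by the
number of workers, so a restarted worker that gets the same index can find
the slot again regardless of which table it picks up next.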

> >
> > With those changes, I did some benchmarking to see if it improves anything.
> > These results compare this patch with the latest version of the master branch. "max_sync_workers_per_subscription" is left at its default of 2.
> > Got some results simply averaging timings from 5 consecutive runs for each branch.
> >
> > First, tested logical replication with empty tables.
> > 10 tables
> > ----------------
> > - master: 286.964 ms
> > - the patch: 116.852 ms
> >
> > 100 tables
> > ----------------
> > - master: 2785.328 ms
> > - the patch: 706.817 ms
> >
> > 10K tables
> > ----------------
> > - master: 39612.349 ms
> > - the patch: 12526.981 ms
> >
> >
> > Also tried replicating tables with some data
> > 10 tables loaded with 10MB data
> > ----------------
> > - master: 1517.714 ms
> > - the patch: 1399.965 ms
> >
> > 100 tables loaded with 10MB data
> > ----------------
> > - master: 16327.229 ms
> > - the patch: 11963.696 ms
> >
> >
> > Then loaded more data
> > 10 tables loaded with 100MB data
> > ----------------
> > - master: 13910.189 ms
> > - the patch: 14770.982 ms
> >
> > 100 tables loaded with 100MB data
> > ----------------
> > - master: 146281.457 ms
> > - the patch: 156957.512 ms
> >
> >
> > If tables are mostly empty, the improvement can be significant - up to 3x faster logical replication.
> > With some data loaded, it can still be faster to some extent.
> >
>
> These results indicate that it is a good idea, especially for very small tables.
>
> > When the table size increases more, the advantage of reusing workers becomes insignificant.
> >
>
> It seems from your results that performance degrades for large
> relations. Did you try to investigate the reasons for the same?

Yeah, it would be interesting to know why there is a drop in some cases.
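
To make the drop easier to see, the quoted timings can be turned into
ratios. This is only a small sketch over the numbers posted above; the case
labels and helper names are mine, and I am assuming the last figure
(156957.512) is in ms like the others.

```python
# Timings in ms as (master, patch), copied from the quoted benchmark results.
# The final patch timing was posted without a unit; ms is assumed here.
RESULTS = {
    "10 empty tables": (286.964, 116.852),
    "100 empty tables": (2785.328, 706.817),
    "10K empty tables": (39612.349, 12526.981),
    "10 tables, 10MB data": (1517.714, 1399.965),
    "100 tables, 10MB data": (16327.229, 11963.696),
    "10 tables, 100MB data": (13910.189, 14770.982),
    "100 tables, 100MB data": (146281.457, 156957.512),
}


def speedup(master_ms: float, patch_ms: float) -> float:
    """Ratio above 1.0 means the patch is faster than master."""
    return master_ms / patch_ms


def regressions(results):
    """Return the cases where the patch took longer than master."""
    return [
        name
        for name, (master_ms, patch_ms) in results.items()
        if patch_ms > master_ms
    ]


if __name__ == "__main__":
    for name, (master_ms, patch_ms) in RESULTS.items():
        print(f"{name}: {speedup(master_ms, patch_ms):.2f}x")
    print("slower with the patch:", regressions(RESULTS))
```

Only the two 100MB cases come out slower with the patch, which is where the
investigation should probably start.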

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
