Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-06-01 11:22:30
Message-ID: CAGPVpCSD64O-HmADVHPx3dgAf4FM-erx9UT4NKdizBOxKSedFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Peter,

Peter Smith <smithpb2250(at)gmail(dot)com>, 26 May 2023 Cum, 10:30 tarihinde
şunu yazdı:
>
> On Thu, May 25, 2023 at 6:59 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
> Yes, I was mostly referring to the same as point 1 below about patch
> 0001. I guess I just found the concept of mixing A) launching TSW (via
> apply worker) with B) reassigning TSW to another relation (by the TSW
> battling with its peers) to be a bit difficult to understand. I
> thought most of the refactoring seemed to arise from choosing to do it
> that way.

No, the refactoring is not related to the way of assigning a new
table. In fact, the patch did not include such refactoring a couple
versions earlier [1] and was still assigning tables the same way. It
was suggested here [2]. Then, I made the patch 0001 which includes
some refactoring and only reuses the worker and nothing else. Also I
find it more understandable this way, maybe it's a bit subjective.

I feel that logical replication related files are getting more and
more complex and hard to understand with each change. IMHO, even
without reusing anything, those need some refactoring anyway. But for
this patch, refactoring some places made it simpler to reuse workers
and/or replication slots, regardless of how tables are assigned to
TSW's.

> +1. I think it would be nice to see POC of both ways for benchmark
> comparison because IMO performance is not the only consideration --
> unless there is an obvious winner, then they need to be judged also by
> the complexity of the logic, the amount of code that needed to be
> refactored, etc.

Will try to do that. But, like I mentioned above, I don't think that
such a change would reduce the complexity or number of lines changed.

> But it is difficult to get an overall picture of the behaviour. Mostly
> when benchmarks were posted you hold one variable fixed and show only
> one other varying. It always leaves me wondering -- what about not
> empty tables, or what about different numbers of tables etc. Is it
> possible to make some script to gather a bigger set of results so we
> can see everything at once? Perhaps then it will become clear there is
> some "sweet spot" where the patch is really good but beyond that it
> degrades (actually, who knows what it might show).

I actually shared the benchmarks with different numbers of tables and
sizes. But those were all with 2 workers. I guess you want a similar
benchmark with different numbers of workers.
Will work on this and share soon.

[1] https://www.postgresql.org/message-id/CAGPVpCQmEE8BygXr%3DHi2N2t2kOE%3DPJwofn9TX0J9J4crjoXarQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAAKRu_YKGyF%2BsvRQqe1th-mG9xLdzneWgh9H1z1DtypBkawkkw%40mail.gmail.com

Thanks,
--
Melih Mutlu
Microsoft

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dagfinn Ilmari Mannsåker 2023-06-01 12:12:22 [PATCH] Using named captures in Catalog::ParseHeader()
Previous Message Melih Mutlu 2023-06-01 10:54:02 Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication