Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-18 06:39:26
Message-ID: CAA4eK1LotEuPsteuJMNpixxTj6R4B8k93q-6ruRmDzCxKzMNpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 13, 2023 at 11:50 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some review comments for patch v79-0002.
>

So, this is about the latest v84-0001-Stop-extra-worker-if-GUC-was-changed.

> ======
>
> General
>
> 1.
>
> I saw that earlier in this thread Hou-san [1] and Amit [2] also seemed
> to say there is not much point for this patch.
>
> So I wanted to +1 that same opinion.
>
> I feel this patch just adds more complexity for almost no gain:
> - reducing the 'max_apply_workers_per_suibscription' seems not very
> common in the first place.
> - even when the GUC is reduced, at that point in time all the workers
> might be in use so there may be nothing that can be immediately done.
> - IIUC the excess workers (for a reduced GUC) are going to get freed
> naturally anyway over time as more transactions are completed so the
> pool size will reduce accordingly.
>

I am still not sure if it is worth pursuing this patch because of the
above reasons. I don't think it would be difficult to add this even at
a later point in time if we really see a use case for this.
Sawada-San, IIRC, you raised this point. What do you think?

The other point I am wondering is whether we can have a different way
to test partial serialization apart from introducing another developer
GUC (stream_serialize_threshold). One possibility could be that we can
have a subscription option (parallel_send_timeout or something like
that) with some default value (current_timeout used in the patch)
which will be used only when streaming = parallel. Users may want to
wait for more time before serialization starts depending on the
workload (say when resource usage is high on a subscriber-side
machine, or there are concurrent long-running transactions that can
block parallel apply for a bit longer time). I know with this as well
it may not be straightforward to test the functionality because we
can't be sure how many changes would be required for a timeout to
occur. This is just for brainstorming other options to test the
partial serialization functionality.

Thoughts?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Takamichi Osumi (Fujitsu) 2023-01-18 07:00:43 Modify the document of Logical Replication configuration settings
Previous Message Takamichi Osumi (Fujitsu) 2023-01-18 06:11:02 RE: Time delayed LR (WAS Re: logical replication restrictions)