Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-19 05:40:46
Message-ID: CAA4eK1JdDrFxcw20YHifFrfX1o_e=9sXpNMfNC5vMYvYcebwNg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 18, 2023 at 12:09 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jan 13, 2023 at 11:50 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Here are some review comments for patch v79-0002.
> >
>
> So, this is about the latest v84-0001-Stop-extra-worker-if-GUC-was-changed.
>
> >
> > I feel this patch just adds more complexity for almost no gain:
> > - reducing the 'max_apply_workers_per_subscription' seems not very
> > common in the first place.
> > - even when the GUC is reduced, at that point in time all the workers
> > might be in use so there may be nothing that can be immediately done.
> > - IIUC the excess workers (for a reduced GUC) are going to get freed
> > naturally anyway over time as more transactions are completed so the
> > pool size will reduce accordingly.
> >
>
> I am still not sure if it is worth pursuing this patch because of the
> above reasons. I don't think it would be difficult to add this even at
> a later point in time if we really see a use case for this.
> Sawada-San, IIRC, you raised this point. What do you think?
>
> The other point I am wondering is whether we can have a different way
> to test partial serialization apart from introducing another developer
> GUC (stream_serialize_threshold). One possibility could be that we can
> have a subscription option (parallel_send_timeout or something like
> that) with some default value (current_timeout used in the patch)
> which will be used only when streaming = parallel. Users may want to
> wait for more time before serialization starts depending on the
> workload (say when resource usage is high on a subscriber-side
> machine, or there are concurrent long-running transactions that can
> block parallel apply for a bit longer time). I know with this as well
> it may not be straightforward to test the functionality because we
> can't be sure how many changes would be required for a timeout to
> occur. This is just for brainstorming other options to test the
> partial serialization functionality.
>

Apart from the above, we can also have a subscription option to
specify parallel_shm_queue_size (queue_size used to determine the
queue between the leader and parallel worker) and that can be used for
this purpose. Basically, configuring it to a smaller value can help in
reducing the test time, but it will still not eliminate the dependency
on timing, because we still have to wait before switching to partial
serialize mode. I think this can be used in production as well to tune
the performance depending on workload.
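
To illustrate why a smaller queue shortens the test, here is a rough
sketch in Python. It is a toy model, not code from the patch: the
function name, the fixed change size, and the sizes used are all
assumptions for illustration, and it ignores any per-message overhead
in the real shared-memory queue. It only counts how many changes the
leader could enqueue before the queue fills up when the parallel
worker is blocked and consumes nothing.

```python
def changes_before_queue_full(queue_size, change_size):
    """Toy model: number of fixed-size changes that fit in the queue
    while the (hypothetically blocked) worker consumes nothing."""
    if queue_size <= 0 or change_size <= 0:
        raise ValueError("sizes must be positive")
    return queue_size // change_size


if __name__ == "__main__":
    # Illustrative sizes only; neither is claimed to be a real default.
    for queue_size in (16 * 1024 * 1024, 64 * 1024):
        n = changes_before_queue_full(queue_size, change_size=1024)
        print(f"queue_size={queue_size}: {n} changes before the queue is full")
```

Even with a small queue, the leader would still have to wait out the
timeout after the queue fills, which is the timing dependency
mentioned above.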

Yet another way is to use the existing parameter logical_decoding_mode
[1]. If the value of logical_decoding_mode is 'immediate', then we can
immediately switch to partial serialize mode. This will eliminate the
dependency on timing. The one argument against using this is that it
won't be as clear as a separate parameter like
'stream_serialize_threshold' proposed by the patch but OTOH we already
have a few parameters that serve a different purpose when used on the
subscriber. For example, 'max_replication_slots' is used to define the
maximum number of replication slots on the publisher and the maximum
number of origins on subscribers. Similarly,
'wal_retrieve_retry_interval' is used for different purposes on
subscriber and standby nodes.
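
The decision being proposed could be sketched roughly as follows. This
is Python used as pseudocode under stated assumptions, not the patch's
implementation: the function name, its arguments, and the timeout
handling are hypothetical, and 'buffered' is assumed here to be the
non-immediate value of logical_decoding_mode.

```python
from enum import Enum


class DecodingMode(Enum):
    BUFFERED = "buffered"    # assumed non-immediate setting
    IMMEDIATE = "immediate"


def should_switch_to_partial_serialize(mode, queue_full, waited_ms, timeout_ms):
    """Hypothetical leader-side check for switching to partial serialize mode."""
    if mode is DecodingMode.IMMEDIATE:
        # Proposed behaviour: switch right away, with no timing dependency.
        return True
    # Otherwise switch only after the send has been blocked on a full
    # queue for at least the timeout.
    return queue_full and waited_ms >= timeout_ms
```

With 'immediate', a test would not need to generate enough changes to
fill the queue or wait for a timeout to elapse.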

[1] - https://www.postgresql.org/docs/devel/runtime-config-developer.html

--
With Regards,
Amit Kapila.
