Re: Perform streaming logical transactions by background workers and parallel apply

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-19 10:22:36
Message-ID: CAJpy0uBVjmzNn10VFsP6RpAKHHaGa0EL-AEdaF6o7ocaONthKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 19, 2023 at 3:44 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Thu, Jan 19, 2023 at 11:11 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Jan 18, 2023 at 12:09 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Jan 13, 2023 at 11:50 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > Here are some review comments for patch v79-0002.
> > > >
> > >
> > > So, this is about the latest v84-0001-Stop-extra-worker-if-GUC-was-changed.
> > >
> > > >
> > > > I feel this patch just adds more complexity for almost no gain:
> > > > - reducing the 'max_apply_workers_per_suibscription' seems not very
> > > > common in the first place.
> > > > - even when the GUC is reduced, at that point in time all the workers
> > > > might be in use so there may be nothing that can be immediately done.
> > > > - IIUC the excess workers (for a reduced GUC) are going to get freed
> > > > naturally anyway over time as more transactions are completed so the
> > > > pool size will reduce accordingly.
> > > >
> > >
> > > I am still not sure if it is worth pursuing this patch because of the
> > > above reasons. I don't think it would be difficult to add this even at
> > > a later point in time if we really see a use case for this.
> > > Sawada-San, IIRC, you raised this point. What do you think?
> > >
> > > The other point I am wondering is whether we can have a different way
> > > to test partial serialization apart from introducing another developer
> > > GUC (stream_serialize_threshold). One possibility could be that we can
> > > have a subscription option (parallel_send_timeout or something like
> > > that) with some default value (current_timeout used in the patch)
> > > which will be used only when streaming = parallel. Users may want to
> > > wait for more time before serialization starts depending on the
> > > workload (say when resource usage is high on a subscriber-side
> > > machine, or there are concurrent long-running transactions that can
> > > block parallel apply for a bit longer time). I know with this as well
> > > it may not be straightforward to test the functionality because we
> > > can't be sure how many changes would be required for a timeout to
> > > occur. This is just for brainstorming other options to test the
> > > partial serialization functionality.
> > >
> >
> > Apart from the above, we can also have a subscription option to
> > specify parallel_shm_queue_size (queue_size used to determine the
> > queue between the leader and parallel worker) and that can be used for
> > this purpose. Basically, configuring it to a smaller value can help in
> > reducing the test time but still, it will not eliminate the need for
> > dependency on timing we have to wait before switching to partial
> > serialize mode. I think this can be used in production as well to tune
> > the performance depending on workload.
> >
> > Yet another way is to use the existing parameter logical_decode_mode
> > [1]. If the value of logical_decoding_mode is 'immediate', then we can
> > immediately switch to partial serialize mode. This will eliminate the
> > dependency on timing. The one argument against using this is that it
> > won't be as clear as a separate parameter like
> > 'stream_serialize_threshold' proposed by the patch but OTOH we already
> > have a few parameters that serve a different purpose when used on the
> > subscriber. For example, 'max_replication_slots' is used to define the
> > maximum number of replication slots on the publisher and the maximum
> > number of origins on subscribers. Similarly,
> > wal_retrieve_retry_interval' is used for different purposes on
> > subscriber and standby nodes.
> >
> > [1] - https://www.postgresql.org/docs/devel/runtime-config-developer.html
> >
> > --
> > With Regards,
> > Amit Kapila.
>
> Hi Amit,
>
> On rethinking the complete model, what I feel is that the name
> logical_decoding_mode is not something which defines modes of logical
> decoding. We, I think, picked it based on logical_decoding_work_mem.
> As per current implementation, the parameter 'logical_decoding_mode'
> tells what happens when work-memory used by logical decoding reaches
> its limit. So it is in-fact 'logicalrep_workmem_vacate_mode' or

Minor correction in what I said earlier:
As per current implementation, the parameter 'logical_decoding_mode'
more or less tells how to deal with workmem i.e. to keep it buffering
with txns until it reaches its limit or immediately vacate it.

> 'logicalrep_trans_eviction_mode'. So if it is set to immediate,
> meaning vacate the work-memory immediately or evict transactions
> immediately. Add buffered means the reverse (i.e. keep on buffering
> transactions until we reach a limit). Now coming to subscribers, we
> can reuse the same parameter. On subscriber as well, shared-memory
> queue could be considered as its workmem and thus the name
> 'logicalrep_workmem_vacate_mode' might look more relevant.
>
> On publisher:
> logicalrep_workmem_vacate_mode=immediate, buffered.
>
> On subscriber:
> logicalrep_workmem_vacate_mode=stream_serialize (or if we want to
> keep immediate here too, that will also be fine).
>
> Thoughts?
> And I am assuming it is possible to change the GUC name before the
> coming release. If not, please let me know, we can brainstorm other
> ideas.
>
> thanks
> Shveta

thanks
Shveta

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-01-19 10:43:23 Re: Logical replication timeout problem
Previous Message Alvaro Herrera 2023-01-19 10:16:29 Re: Rethinking the implementation of ts_headline()