From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-12-23 03:20:58
Message-ID: CAA4eK1LRUjgTsdGF-DbCeP6oq8yh6GzpCoA5cNqaJ_bx3hN-+g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 22, 2022 at 6:18 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Dec 22, 2022 at 7:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Dec 22, 2022 at 11:39 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > Thank you for updating the patch. Here are some comments on v64 patches:
> > >
> > > While testing the patch, I realized that if all streamed transactions
> > > are handled by parallel workers, there is no chance for the leader to
> > > call maybe_reread_subscription() except for when waiting for the next
> > > message. Due to this, the leader didn't stop for a while even if the
> > > subscription gets disabled. It's an extreme case since my test was
> > > that pgbench runs 30 concurrent transactions and logical_decoding_mode
> > > = 'immediate', but we might want to make sure to call
> > > maybe_reread_subscription() at least after committing/preparing a
> > > transaction.
> > >
> >
> > Won't it be better to call it only if we handle the transaction by the
> > parallel worker?
>
> Agreed. And we won't need to do that after handling stream_prepare as
> we don't do that now.
>

I think we currently do this for both the prepare and non-prepare cases via
begin_replication_step(). However, since in both cases the changes are sent to
the parallel apply worker, we miss calling it in both. So, I think it is
better to do it for both commit and prepare.
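
To illustrate the idea (a minimal sketch only, not the actual v64 diff;
the helpers marked hypothetical below are assumptions, and only
maybe_reread_subscription() is the existing worker.c function), the
stream-commit handler would do roughly the following, and similarly for
stream-prepare:

/*
 * Sketch only: after the leader hands a streamed transaction's commit
 * over to a parallel apply worker, it may not return to the main loop
 * for a while, so re-check the subscription here.
 */
static void
apply_handle_stream_commit(StringInfo s)
{
    TransactionId xid;
    LogicalRepCommitData commit_data;

    xid = logicalrep_read_stream_commit(s, &commit_data);

    if (handled_by_parallel_apply_worker(xid))    /* hypothetical helper */
    {
        /* Forward the commit message and wait for the worker to finish. */
        forward_commit_to_parallel_worker(xid, s);    /* hypothetical helper */

        /*
         * The leader did not apply the changes itself, so give it a
         * chance to notice that the subscription was changed or disabled
         * before it starts waiting for the next message.
         */
        maybe_reread_subscription();
    }
    else
    {
        /* existing leader-apply path, unchanged */
    }
}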

> >
> > It seems currently we give a similar message when the logical
> > replication worker slots are finished "out of logical replication
> > worker slots" or when we are not able to register background workers
> > "out of background worker slots". Now, OTOH, when we exceed the limit
> > of sync workers "max_sync_workers_per_subscription", we don't display
> > any message. Personally, I think if any user has used the streaming
> > option as "parallel" she wants all large transactions to be performed
> > in parallel and if the system is not able to deal with it, displaying
> > a LOG message will be useful for users. This is because the
> > performance difference for large transactions between parallel and
> > non-parallel is big (30-40%) and it is better for users to know as
> > soon as possible instead of expecting them to run some monitoring
> > query to notice the same.
>
> I see your point. But looking at other parallel features such as
> parallel queries, parallel vacuum and parallel index creation, we
> don't give such messages even if the number of parallel workers
> actually launched is lower than the ideal. They also bring a big
> performance benefit.
>

Fair enough. Let's remove this LOG message.
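
For context, the message being dropped is the one the leader would have
emitted when it cannot start a parallel apply worker and silently falls
back to applying the streamed transaction itself, roughly as below (a
sketch with an assumed helper name and illustrative wording, not the
actual v64 code):

    ParallelApplyWorkerInfo *winfo;

    winfo = try_start_parallel_apply_worker(xid);    /* hypothetical helper */
    if (winfo == NULL)
    {
        /* The LOG message being removed per the discussion above: */
        ereport(LOG,
                (errmsg("out of parallel apply workers, streamed transaction will be applied by the leader")));

        /* fall back to the existing serialize/leader-apply path */
    }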

--
With Regards,
Amit Kapila.
