RE: Perform streaming logical transactions by background workers and parallel apply

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-05-25 02:24:59
Message-ID: OS3PR01MB627532B048F71FFF8FBDB6A99ED69@OS3PR01MB6275.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Fri, May 13, 2022 4:53 PM houzj(dot)fnst(at)fujitsu(dot)com wrote:
> On Wednesday, May 11, 2022 1:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >
> > On Wed, May 11, 2022 at 9:35 AM Masahiko Sawada
> > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Tue, May 10, 2022 at 5:59 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > wrote:
> > > >
> > > > On Tue, May 10, 2022 at 10:39 AM Masahiko Sawada
> > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > >
> > > > > Having it optional seems a good idea. BTW can the user configure
> > > > > how many apply bgworkers can be used per subscription or in the
> > > > > whole system? Like max_sync_workers_per_subscription, is it better
> > > > > to have a configuration parameter or a subscription option for
> > > > > that? If so, setting it to 0 probably means to disable the parallel apply
> > feature.
> > > > >
> > > >
> > > > Yeah, that might be useful but we are already giving an option while
> > > > creating a subscription whether to allow parallelism, so will it be
> > > > useful to give one more way to disable this feature? OTOH, having
> > > > something like max_parallel_apply_workers/max_bg_apply_workers at
> > > > the system level can give better control for how much parallelism
> > > > the user wishes to allow for apply work.
> > >
> > > Or we can have something like
> > > max_parallel_apply_workers_per_subscription that controls how many
> > > parallel apply workers can launch per subscription. That also gives
> > > better control for the number of parallel apply workers.
> > >
> >
> > I think we can go either way in this matter as both have their pros and cons. I
> > feel limiting the parallel workers per subscription gives better control but
> > OTOH, it may not allow max usage of parallelism because some quota from
> > other subscriptions might remain unused. Let us see what Hou-San or others
> > think on this matter?
>
> Thanks for Amit and Sawada-san's comments !
> I will think over these approaches and reply soon.

After reading the thread, I wrote two patches to address these comments.

The first patch (see v6-0003):
Improve the feature as suggested in [1].
For the issue mentioned by Amit-san (a blocking problem in the case described
by Sawada-san), I investigated and I think it is caused by a unique index. So I
added a check to make sure the unique columns are the same on the publisher and
the subscriber.
For other cases, I added a check for whether any non-immutable function is
present in an expression on the subscriber's relation. The check covers the
following 3 items:
a. The functions in triggers;
b. Column default value expressions and domain constraints;
c. Constraint expressions.
BTW, I have not added partitioned-table-related code. I think that part needs
additional modifications. I will add it later, once those modifications are
finished.
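
To make the intent of the immutability check concrete, here is a minimal
standalone sketch. It is not code from v6-0003: the helper name
all_functions_immutable and the idea of collecting the volatility codes into an
array are assumptions made only for illustration. The 'i'/'s'/'v' codes are the
values stored in pg_proc.provolatile.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Volatility codes as stored in pg_proc.provolatile. */
#define PROVOLATILE_IMMUTABLE 'i'
#define PROVOLATILE_STABLE    's'
#define PROVOLATILE_VOLATILE  'v'

/*
 * Hypothetical helper (not from the patch): given the volatility codes of
 * all functions found in a relation's trigger, column default, domain
 * constraint and constraint expressions, report whether every one of them
 * is immutable.  An empty list counts as "all immutable".
 */
static bool
all_functions_immutable(const char *volatility, size_t nfuncs)
{
    assert(volatility != NULL || nfuncs == 0);

    for (size_t i = 0; i < nfuncs; i++)
    {
        /* Both stable and volatile functions fail the check. */
        if (volatility[i] != PROVOLATILE_IMMUTABLE)
            return false;
    }
    return true;
}
```

In this sketch a single stable or volatile function is enough to fail the
check for the relation.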

The second patch (see v6-0004):
Improve the feature as suggested in [2].
Add a GUC "max_apply_bgworkers_per_subscription" to control parallelism. This
GUC controls how many apply background workers can be launched per
subscription. I set its default value to 3 and did not change the default
values of other GUCs.
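
As a rough illustration of how such a per-subscription limit behaves, here is a
minimal standalone sketch. It is not code from v6-0004: the function name
can_launch_apply_bgworker and the plain int variable standing in for the GUC
are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for the GUC; 3 is the default proposed in v6-0004.  In the real
 * server this would be a GUC variable, not a plain static int.
 */
static int max_apply_bgworkers_per_subscription = 3;

/*
 * Hypothetical helper (not from the patch): decide whether one more apply
 * background worker may be launched for a subscription that already has
 * running_for_subscription workers.
 */
static bool
can_launch_apply_bgworker(int running_for_subscription)
{
    assert(running_for_subscription >= 0);

    return running_for_subscription < max_apply_bgworkers_per_subscription;
}
```

With the limit set to 0, this sketch never allows a launch, which matches
Sawada-san's remark that 0 would probably mean the feature is disabled.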

[1] - https://www.postgresql.org/message-id/CAA4eK1JwahU_WuP3S%2B7POqta%3DPhm_3gxZeVmJuuoUq1NV%3DkrXA%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1%2B7D4qAQUQEE8zzQ0fGCqeBWd3rzTaY5N0jVs-VXFc_Xw%40mail.gmail.com

The patches are attached. (v6-0001 and v6-0002 are unchanged.)

Regards,
Wang wei

Attachment Content-Type Size
v6-0001-Perform-streaming-logical-transactions-by-backgro.patch application/octet-stream 74.3 KB
v6-0002-Test-streaming-apply-option-in-tap-test.patch application/octet-stream 64.8 KB
v6-0003-Add-some-checks-before-using-apply-background-wor.patch application/octet-stream 17.7 KB
v6-0004-Add-a-GUC-max_apply_bgworkers_per_subscription-to.patch application/octet-stream 6.8 KB
