Re: Perform streaming logical transactions by background workers and parallel apply

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-11-18 05:00:59
Message-ID: CAD21AoDNOVGghV3HdqcEA_otdXPqEzHykO-sZJbBkTisRgoTZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 18, 2022 at 1:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Nov 18, 2022 at 8:01 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 18, 2022 at 11:36 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > ...
> > > ---
> > > The streaming parameter has the new value "parallel" for "streaming"
> > > option to enable the parallel apply. It fits so far but I think the
> > > parallel apply feature doesn't necessarily need to be tied up with
> > > streaming replication. For example, we might want to support parallel
> > > apply also for non-streaming transactions in the future. It might be
> > > better to have another option, say "parallel", to control parallel
> > > apply behavior. The "parallel" option can be a boolean option and
> > > setting parallel = on requires streaming = on.
> > >
>
> If we do that then how will the user be able to use streaming
> serialize mode (write to file for streaming transactions) as we have
> now? Because after we introduce parallelism for non-streaming
> transactions, the user would want parallel = on irrespective of the
> streaming mode. Also, users may wish to only parallelize large
> transactions because of additional overhead for non-streaming
> transactions for transaction dependency tracking, etc. So, the user
> may wish to have a separate knob for large transactions as the patch
> has now.

One idea for that would be to make it enum. For example, setting
parallel = "streaming" works for that.

>
> >
> > FWIW, I tend to agree with this idea but for a different reason. In
> > this patch, the 'streaming' parameter had become a kind of hybrid
> > boolean/enum. AFAIK there are no other parameters anywhere that use a
> > hybrid pattern like this so I was thinking it may be better not to be
> > different.
> >
>
> I think we have a similar pattern for GUC parameters like
> constraint_exclusion (see constraint_exclusion_options),
> backslash_quote (see backslash_quote_options), etc.

Right. vacuum_index_cleanup and buffering storage parameters that
accept 'on', 'off', or 'auto') are other examples.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2022-11-18 05:01:12 Avoid double lookup in pgstat_fetch_stat_tabentry()
Previous Message Amit Kapila 2022-11-18 04:47:10 Re: Perform streaming logical transactions by background workers and parallel apply