Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-08-02 12:05:22
Message-ID: CAA4eK1LSEskMhz6noNbsyMYOrcXOOQ5tFVNpy3Ck2so-s4Q6NA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 2, 2022 at 5:16 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, July 27, 2022 4:22 PM houzj(dot)fnst(at)fujitsu(dot)com wrote:
> >
> > On Tuesday, July 26, 2022 5:34 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com>
> > wrote:
> >
> > > 3.
> > > Why are we restricting parallel apply workers to streamed
> > > transactions only? Streaming depends on the size of
> > > logical_decoding_work_mem, so making streaming and parallel apply
> > > tightly coupled seems too restrictive to me. Do we see some obvious
> > > problems in applying other transactions in parallel?
> >
> > We thought there could be conflict failures and deadlocks if we apply
> > normal transactions in parallel, which would need a transaction dependency
> > check[1]. But I will do some more research on this and share the results
> > soon.
>
> After thinking about this, I confirmed that it would be easy to cause a
> deadlock error if we apply normal transactions in parallel without
> additional dependency analysis and handling to preserve the COMMIT order.
>
> The basic idea for applying normal transactions in parallel in the first
> version is this: the main apply worker receives data from the publisher and
> passes it to an apply bgworker without applying it itself. Only before the
> apply bgworker applies the final COMMIT command does it need to wait for any
> previous transaction to finish, to preserve the commit order. This means we
> could pass the next transaction's data to another apply bgworker before the
> previous transaction is committed in the first apply bgworker.
>
> In this approach, we have to do the dependency analysis because it's easy to
> cause a deadlock error when applying DMLs in parallel (see the attachment for
> examples where the deadlock could happen). So, it's a bit different from
> streaming transactions.
>
> We could instead apply the next transaction only after the first transaction
> is committed, in which case we don't need the dependency analysis. But that
> would not bring a noticeable performance improvement even if we start several
> apply workers, because the actual DMLs are not performed in parallel.
>
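
The commit-order rule quoted above can be illustrated with a small sketch.
This is not the patch's code. It is a toy model using Python threads, and
the names apply_in_parallel, worker, done and commit_log are made up for
illustration. Each thread stands in for an apply bgworker: it applies its
changes right away and waits for the previous transaction only just before
its own COMMIT. The model has no row locks, so it cannot show the deadlock
discussed above.

```python
import threading


def apply_in_parallel(transactions):
    """Apply each transaction in its own thread, committing in list order.

    `transactions` is a list of (xid, changes) pairs in publisher commit
    order. Returns (applied, commit_log). All names are hypothetical.
    """
    commit_log = []   # xids in the order they committed
    applied = {}      # xid -> list of changes applied by its worker
    # done[i] is set once the i-th transaction has committed.
    done = [threading.Event() for _ in transactions]

    def worker(index, xid, changes):
        # Changes are applied without waiting for earlier transactions.
        applied[xid] = list(changes)
        # Only the COMMIT waits for the previous transaction to finish.
        if index > 0:
            done[index - 1].wait()
        commit_log.append(xid)
        done[index].set()

    threads = [
        threading.Thread(target=worker, args=(i, xid, changes))
        for i, (xid, changes) in enumerate(transactions)
    ]
    # Start in reverse to show that start order does not decide commit order.
    for t in reversed(threads):
        t.start()
    for t in threads:
        t.join()
    return applied, commit_log
```

Calling apply_in_parallel([(101, ["INSERT a"]), (102, ["UPDATE a"])]) yields
a commit_log of [101, 102] regardless of which thread runs first. In a real
subscriber the apply step takes row locks, which is where the dependency
analysis mentioned above would come in.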

I agree that for short transactions it may not bring a noticeable
performance improvement, but somewhat larger transactions could still
benefit from parallelism in which we don't start operating on a new
transaction until the previous transaction has committed.
Having said that, I think we can enable parallelism for non-streaming
transactions as a separate patch.

--
With Regards,
Amit Kapila.
