Re: Perform streaming logical transactions by background workers and parallel apply

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-08-08 04:48:44
Message-ID: CAFiTN-s-mOXbzvOnQOV3KU_=+m3bPb8K3k22SkeDNKbozTaEbQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 2, 2022 at 5:16 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, July 27, 2022 4:22 PM houzj(dot)fnst(at)fujitsu(dot)com wrote:
> >
> > On Tuesday, July 26, 2022 5:34 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com>
> > wrote:
> >
> > > 3.
> > > Why are we restricting parallel apply workers only for the streamed
> > > transactions, because streaming depends upon the size of the logical
> > > decoding work mem so making streaming and parallel apply tightly
> > > coupled seems too restrictive to me. Do we see some obvious problems
> > > in applying other transactions in parallel?
> >
> > We thought there could be conflict failures and deadlocks if we apply
> > normal transactions in parallel, which would need a transaction dependency
> > check[1]. But I will do some more research on this and share the results soon.
>
> After thinking about this, I confirmed that it would be easy to cause deadlock
> errors if we don't have additional dependency analysis and
> COMMIT-order-preserving handling when applying normal transactions in parallel.
>
> The basic idea for applying normal transactions in parallel in the first
> version is this: the main apply worker receives data from the publisher and
> passes it to an apply bgworker without applying it itself. Only before the
> apply bgworker applies the final COMMIT command does it need to wait for any
> previous transaction to finish, to preserve the commit order. This means we
> could pass the next transaction's data to another apply bgworker before the
> previous transaction is committed in the first apply bgworker.
>
> In this approach, we have to do the dependency analysis because it's easy to
> cause deadlock errors when applying DMLs in parallel (see the attachment for
> examples where a deadlock could happen). So it's a bit different from
> streaming transactions.
>
> We could apply the next transaction only after the first transaction is
> committed, in which case we don't need the dependency analysis, but that would
> not bring a noticeable performance improvement even if we start several apply
> workers, because the actual DMLs are not performed in parallel.
>
> Based on the above, we plan to first introduce the patch to perform streaming
> logical transactions by background workers, and then introduce parallel apply
> of normal transactions, whose design is different and needs some additional
> handling.

Yeah, I think that makes sense. Since streamed transactions are sent to
the standby interleaved, we can take advantage of parallelism, and along
with that we can also avoid the I/O, which will give a further speedup.
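
To make the commit-order scheme quoted above concrete, here is a minimal
sketch in Python. It is not PostgreSQL code and not taken from the patch:
the function name parallel_apply, the per-transaction events, and the
in-memory commit log are all hypothetical stand-ins. Each worker applies
its transaction's changes right away and only waits for the previous
transaction at commit time.

```python
import threading


def parallel_apply(transactions):
    """Apply each transaction in its own worker thread, committing in arrival order.

    `transactions` is a list of lists of changes. Everything here is an
    illustrative assumption; a real apply worker would execute DML instead
    of copying a list.
    """
    commit_log = []
    log_lock = threading.Lock()
    # committed[i] is set once transaction i has committed.
    committed = [threading.Event() for _ in transactions]

    def worker(idx, changes):
        # Stand-in for applying the DML: may overlap with other workers.
        applied = list(changes)
        # Before the final COMMIT, wait for the previous transaction to finish.
        if idx > 0:
            committed[idx - 1].wait()
        with log_lock:
            commit_log.append((idx, applied))
        committed[idx].set()

    threads = [
        threading.Thread(target=worker, args=(i, txn))
        for i, txn in enumerate(transactions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return commit_log
```

In this sketch the transactions never contend for locks. With real DML, a
later transaction could presumably hold a lock that an earlier one needs
while itself waiting at the commit step, which is the kind of deadlock the
quoted mail says needs dependency analysis.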

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
