RE: Perform streaming logical transactions by background workers and parallel apply

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-05-02 03:35:58
Message-ID: OS0PR01MB5716E3C106DF49B3056EF72C946F9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Friday, April 28, 2023 2:18 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Apr 28, 2023 at 11:51 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Apr 26, 2023 at 4:11 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > On Wednesday, April 26, 2023 5:00 PM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> > > >
> > > > IIUC, that assert will fail in case of any error raised between
> > > > ApplyWorkerMain()->logicalrep_worker_attach()->before_shmem_exit() and
> > > > ApplyWorkerMain()->InitializeApplyWorker()->BackgroundWorkerInitializeConnectionByOid()->InitPostgres().
> > >
> > > Thanks for reporting the issue.
> > >
> > > I think the problem is that it tried to release locks in
> > > logicalrep_worker_onexit() before the initialization of the process is
> > > complete because this callback function was registered before the init
> > > phase. So I think we can add a conditional statement before releasing
> > > locks. Please find an attached patch.
> > >
> >
> > Alexander, does the proposed patch fix the problem you are facing?
> > Sawada-San, and others, do you see any better way to fix it than what
> > has been proposed?
>
> I'm concerned that the idea of relying on IsNormalProcessingMode()
> might not be robust since if we change the meaning of
> IsNormalProcessingMode() some day it would silently break again. So I
> prefer using something like InitializingApplyWorker, or another idea
> would be to do cleanup work (e.g., fileset deletion and lock release)
> in a separate callback that is registered after connecting to the
> database.

Thanks for the review. I agree that it’s better to use a new variable here.
Attached is a patch that does so.
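
To illustrate the idea only (this is not the patch itself), here is a
standalone sketch of guarding the exit callback with a dedicated flag. Only
the name InitializingApplyWorker comes from the discussion above; the helper
functions, the process_init_done flag, and the release counter are
hypothetical stand-ins, not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Standalone model; none of this is PostgreSQL code. Only the flag name
 * InitializingApplyWorker comes from the discussion in this thread.
 */
static bool InitializingApplyWorker = false;
static bool process_init_done = false;  /* hypothetical: models completed process init */
static int  locks_released = 0;         /* hypothetical: counts simulated releases */

/* Simulated lock release; asserts that process initialization has finished. */
static void
release_locks(void)
{
    assert(process_init_done);
    locks_released++;
}

/* Simulated exit callback, loosely modeled on logicalrep_worker_onexit(). */
static void
worker_onexit(void)
{
    /* Skip lock release if we are exiting before initialization finished. */
    if (!InitializingApplyWorker)
        release_locks();
}

/*
 * Simulate one worker lifetime and return how many times locks were released.
 * fail_during_init models an ERROR raised after the callback is registered
 * but before process initialization completes.
 */
static int
simulate_worker(bool fail_during_init)
{
    process_init_done = false;
    locks_released = 0;

    InitializingApplyWorker = true;   /* callback is "registered" from here on */

    if (fail_during_init)
    {
        worker_onexit();              /* early exit: the guard skips the release */
        return locks_released;
    }

    process_init_done = true;         /* initialization finished */
    InitializingApplyWorker = false;

    worker_onexit();                  /* normal exit: release is safe */
    return locks_released;
}
```

In this model, simulate_worker(true) exits without touching the locks, while
simulate_worker(false) releases them once. Unlike a check on
IsNormalProcessingMode(), the guard depends only on a flag the apply worker
itself controls.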

>
> FWIW, we might need to be careful about the timing when we call
> logicalrep_worker_detach() in the worker's termination process. Since
> we rely on IsLogicalParallelApplyWorker() for the parallel apply
> worker to send ERROR messages to the leader apply worker, if an ERROR
> happens after logicalrep_worker_detach(), we will end up with the
> assertion failure.
>
> if (IsLogicalParallelApplyWorker())
>     SendProcSignal(pq_mq_parallel_leader_pid,
>                    PROCSIG_PARALLEL_APPLY_MESSAGE,
>                    pq_mq_parallel_leader_backend_id);
> else
> {
>     Assert(IsParallelWorker());
>
> It normally would be a should-not-happen case, though.

Yes. I think that currently the parallel apply worker (PA) sends the ERROR
message before exiting, so the exit callbacks always fire after the code
above, which looks fine to me.
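
The ordering concern can be modeled with a small standalone sketch. It
assumes that detaching clears the state that the
IsLogicalParallelApplyWorker()-style check relies on; the flag, the enum, and
all function names below are hypothetical and are not the PostgreSQL
definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of reporting an ERROR in this model. */
typedef enum
{
    SENT_TO_LEADER,     /* check passed, leader would be signaled */
    WOULD_HIT_ASSERT    /* check failed, the else branch's Assert would fire */
} ErrorRoute;

/* Hypothetical stand-in for the state behind IsLogicalParallelApplyWorker(). */
static bool attached_as_parallel_apply = false;

static void
attach(void)
{
    attached_as_parallel_apply = true;
}

static void
detach(void)
{
    assert(attached_as_parallel_apply);   /* only detach what was attached */
    attached_as_parallel_apply = false;
}

/* Models the branch quoted above: route the ERROR based on the current state. */
static ErrorRoute
report_error(void)
{
    if (attached_as_parallel_apply)
        return SENT_TO_LEADER;
    return WOULD_HIT_ASSERT;
}

/* Current behavior as described: ERROR is sent first, exit callbacks run later. */
static ErrorRoute
error_then_detach(void)
{
    attach();
    ErrorRoute route = report_error();
    detach();
    return route;
}

/* The should-not-happen case: an ERROR raised after the worker has detached. */
static ErrorRoute
detach_then_error(void)
{
    attach();
    detach();
    return report_error();
}
```

Under these assumptions, error_then_detach() takes the leader-signaling
branch, and only detach_then_error() reaches the branch with the Assert.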

Best Regards,
Hou zj

Attachment Content-Type Size
v2-0001-Fix-assert-failure-in-logical-replication-apply-w.patch application/octet-stream 4.0 KB
