Re: Perform streaming logical transactions by background workers and parallel apply

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-10-11 00:22:08
Message-ID: CAD21AoDm3224e=se7=ZYt=R+v0_ZJ4E9dd5y2816_rTTCV+G+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 7, 2022 at 2:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Oct 7, 2022 at 8:47 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Oct 6, 2022 at 9:04 PM houzj(dot)fnst(at)fujitsu(dot)com
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > I think the root reason for this kind of deadlock problems is the table
> > > structure difference between publisher and subscriber(similar to the unique
> > > difference reported earlier[1]). So, I think we'd better disallow this case. For
> > > example to avoid the reported problem, we could only support parallel apply if
> > > pubviaroot is false on publisher and replicated tables' types(relkind) are the
> > > same between publisher and subscriber.
> > >
> > > Although it might restrict some use cases, but I think it only restrict the
> > > cases when the partitioned table's structure is different between publisher and
> > > subscriber. User can still use parallel apply for cases when the table
> > > structure is the same between publisher and subscriber which seems acceptable
> > > to me. And we can also document that the feature is expected to be used for the
> > > case when tables' structure are the same. Thoughts ?
> >
> > I'm concerned that it could be a big restriction for users. Having
> > different partitioned table's structures on the publisher and the
> > subscriber is quite common use cases.
> >
> > From the feature perspective, the root cause seems to be the fact that
> > the apply worker does both receiving and applying changes. Since it
> > cannot receive the subsequent messages while waiting for a lock on a
> > table, the parallel apply worker also cannot move forward. If we have
> > a dedicated receiver process, it can off-load the messages to the
> > worker while another process waiting for a lock. So I think that
> > separating receiver and apply worker could be a building block for
> > parallel-apply.
> >
>
> I think the disadvantage that comes to mind is the overhead of passing
> messages between receiver and applier processes even for non-parallel
> cases. Now, I don't think it is advisable to have separate handling
> for non-parallel cases. The other thing is that we need to someway
> deal with feedback messages which helps to move synchronous replicas
> and update subscriber's progress which in turn helps to keep the
> restart point updated. These messages also act as heartbeat messages
> between walsender and walapply process.
>
> To deal with this, one idea is that we can have two connections to
> walsender process, one with walreceiver and the other with walapply
> process which according to me could lead to a big increase in resource
> consumption and it will bring another set of complexities in the
> system. Now, in this, I think we have two possibilities, (a) The first
> one is that we pass all messages to the leader apply worker and then
> it decides whether to execute serially or pass it to the parallel
> apply worker. However, that can again deadlock in the truncate
> scenario we discussed because the main apply worker won't be able to
> receive new messages once it is blocked at the truncate command. (b)
> The second one is walreceiver process itself takes care of passing
> streaming transactions to parallel apply workers but if we do that
> then walreceiver needs to wait at the transaction end to maintain
> commit order which means it can also lead to deadlock in case the
> truncate happens in a streaming xact.

I imagined (b) but I had missed the point of preserving the commit
order. Separating the receiver and apply worker cannot resolve this
problem.

>
> The other alternative is that we allow walreceiver process to wait for
> apply process to finish transaction and send the feedback but that
> seems to be again an overhead if we have to do it even for small
> transactions, especially it can delay sync replication cases. Even, if
> we don't consider overhead, it can still lead to a deadlock because
> walreceiver won't be able to move in the scenario we are discussing.
>
> About your point that having different partition structures for
> publisher and subscriber, I don't know how common it will be once we
> have DDL replication. Also, the default value of
> publish_via_partition_root is false which doesn't seem to indicate
> that this is a quite common case.

So how can we consider these concurrent issues that could happen only
when streaming = 'parallel'? Can we restrict some use cases to avoid
the problem or can we have a safeguard against these conflicts? We
could find a new problematic scenario in the future and if it happens,
logical replication gets stuck, it cannot be resolved only by apply
workers themselves.

Regards,

--
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2022-10-11 00:37:46 Re: autovacuum_freeze_max_age reloption seems broken
Previous Message David Rowley 2022-10-11 00:16:50 Re: shadow variables - pg15 edition