Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-05-11 04:54:26
Message-ID: CAA4eK1K7btvyHzy6ps=gtMi8y+PQQRt0prPXvF-djTb7aHtkiQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 11, 2022 at 9:17 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, May 10, 2022 at 6:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, May 10, 2022 at 10:35 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Wed, May 4, 2022 at 12:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > I think the other kind of problem that can happen here is delete
> > > > followed by an insert. If in the example provided by you, TX-1
> > > > performs delete (say it is large enough to cause streaming) and TX-2
> > > > performs insert then I think it will block the apply worker because
> > > > insert will start waiting indefinitely. Currently, I think it will lead
> > > > to conflict due to insert but that is still solvable by allowing users
> > > > to remove conflicting rows.
> > > >
> > > > It seems both of these problems arise because the table has different
> > > > constraints on the publisher and the subscriber; otherwise, we would
> > > > have seen the same behavior on the publisher as well.
> > > >
> > > > There could be a few ways to avoid these and similar problems:
> > > > a. detect the difference in constraints between publisher and
> > > > subscribers like primary key and probably others (like whether there
> > > > is any volatile function present in index expression) when applying
> > > > the change and then we give ERROR to the user that she must change the
> > > > streaming mode to 'spill' instead of 'apply' (aka parallel apply).
> > > > b. Same as (a) but instead of ERROR just LOG this information and
> > > > change the mode to spill for the transactions that operate on that
> > > > particular relation.
> > >
> > > Given that it doesn't introduce a new kind of problem I don't think we
> > > need special treatment for that at least in this feature.
> > >
> >
> > Isn't the problem related to infinite wait by insert as explained in
> > my previous email (in the above-quoted text) a new kind of problem
> > that won't exist in the current implementation?
> >
>
> Sorry I had completely missed the point that the commit order won't be
> changed. I agree that this new implementation would introduce a new
> kind of issue as you mentioned above, and the opposite is not true.
>
> Regarding the case you explained in the previous email I also think it
> will happen with the parallel apply feature. The apply worker will be
> blocked until the conflict is resolved. I'm not sure how to avoid
> that. It would not be easy to compare constraints between the publisher
> and subscribers when replicating partitioned tables.
>

I agree that partitioned tables need some more thought, but in the
simple cases where replication happens via the individual partitions
(the default), we can detect the difference as we do for normal tables.
OTOH, when replication happens via the root (publish_via_partition_root),
it could be tricky, as the partitions could be different on both sides.
I think the cases where we can't safely identify the constraint
difference won't be considered for apply via a new bg worker.
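
A rough sketch of that rule, for illustration only. It is a plain
Python model, not PostgreSQL code: RelationInfo, constraints_comparable
and choose_mode are hypothetical names, and it assumes the subscriber
somehow knows the publisher's unique keys, which nothing in this thread
has settled. It follows the "fall back to spill" flavour of option (b)
rather than raising an ERROR.

```python
from dataclasses import dataclass
from enum import Enum


class StreamMode(Enum):
    APPLY = "apply"  # apply streamed changes in a background worker
    SPILL = "spill"  # spill streamed changes to disk, apply at commit


@dataclass(frozen=True)
class RelationInfo:
    """Hypothetical summary of a table on one side; not a PostgreSQL struct."""
    unique_keys: frozenset  # e.g. frozenset({("id",)})
    has_volatile_index_expr: bool = False
    is_partitioned: bool = False


def constraints_comparable(pub, sub, via_root):
    """Return False when the constraint difference can't be safely identified."""
    # With publish_via_partition_root the partitions may differ on both
    # sides, so this sketch refuses to compare at all in that case.
    if via_root and (pub.is_partitioned or sub.is_partitioned):
        return False
    return True


def choose_mode(pub, sub, via_root=False):
    """Pick the streaming mode for transactions touching this relation."""
    if not constraints_comparable(pub, sub, via_root):
        return StreamMode.SPILL
    if sub.has_volatile_index_expr:
        return StreamMode.SPILL
    # A unique key present only on the subscriber can make an insert wait
    # on a row that another streamed transaction has not yet committed.
    if sub.unique_keys - pub.unique_keys:
        return StreamMode.SPILL
    return StreamMode.APPLY
```

With this model, a relation replicated via its individual partitions is
treated like a normal table, while anything replicated via the root is
never handed to the background worker.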

--
With Regards,
Amit Kapila.
