Re: Handle infinite recursion in logical replication setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-08-02 11:59:36
Message-ID: CAA4eK1JX5-zv1JfX8SoHtfUgcjWVci69hVkbYr8CSB1XN64n6A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 26, 2022 at 9:07 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jul 26, 2022 at 7:13 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
> >
> > Thanks for the example. I agree that it is fairly simple to reproduce.
> >
> > I understand that "copy_data = force" is meant to protect a user from
> > hurting themself. I'm not convinced that this is the best way to do so.
> >
> > For example today I can subscribe to multiple publications that write to
> > the same table. If I have a primary key on that table, and two of the
> > subscriptions try to write an identical ID, we conflict. We don't have
> > any special flags or modes to guard against that from happening, though
> > we do have documentation on conflicts and managing them.
> >
> > AFAICT the same issue with "copy_data" also exists in the above scenario
> > too, even without the "origin" attribute.
> >
>
> That's true but there is no parameter like origin = NONE which
> indicates that constraint violations or duplicate data problems won't
> occur due to replication. In the current case, I think the situation
> is different because a user has specifically asked not to replicate
> any remote data by specifying origin = NONE, which should be dealt with
> differently. Note that current users or their setup won't see any
> difference/change unless they specify the new parameter origin as
> NONE.
>

Let me try to summarize the discussion so that it is easier for others
to follow. The work in this thread aims to avoid loops and duplicate
data in logical replication when operations happen on the same table
on multiple nodes. Email [1] explains, with an example, how a logical
replication setup can lead to duplicate or inconsistent data.

The idea for solving this problem is to not replicate data that was
not generated locally, which we can normally identify from the origin
of the data in WAL. Commit 366283961a achieves that for replication,
but the problem can still happen during the initial sync, which is
performed internally via copy, because we can't differentiate the data
in the heap based on origin. So, we decided to prohibit the
subscription operations that can perform an initial sync (e.g., CREATE
SUBSCRIPTION, ALTER SUBSCRIPTION ... REFRESH) when we detect that the
publisher has subscribed to the same table from some other publisher.
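
To make the distinction between streaming and the initial sync
concrete, here is a toy model in Python. It is only a sketch and not
PostgreSQL code: the WalChange class, the function names, and the
"any" setting are made up for illustration. It assumes that a WAL
change carries an origin (None for locally generated changes), while
rows read from the heap carry none.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WalChange:
    """Hypothetical stand-in for a decoded WAL change."""
    row: int
    origin: Optional[str]  # None means the change was generated locally


def stream_changes(wal: List[WalChange], origin: str = "any") -> List[int]:
    """Rows a publisher would stream for the given origin setting.

    With origin == "none", only locally generated changes are sent.
    Any other value sends everything (assumed default behaviour).
    """
    if origin == "none":
        return [c.row for c in wal if c.origin is None]
    return [c.row for c in wal]


def initial_copy(heap_rows: List[int]) -> List[int]:
    """Initial sync copies the table as-is; heap rows have no origin."""
    return list(heap_rows)


# Node B: row 1 arrived via replication from node A, row 2 is local.
wal_b = [WalChange(1, origin="node_a"), WalChange(2, origin=None)]
heap_b = [c.row for c in wal_b]

streamed = stream_changes(wal_b, origin="none")  # remote row filtered out
copied = initial_copy(heap_b)                    # remote row still copied
print(streamed, copied)
```

In this model, streaming with origin = none skips the row that came
from node A, but the initial copy hands it over anyway, which is the
gap the restriction on subscription operations is meant to close.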

To prohibit these subscription operations, the currently proposed
patch throws an error. It also provides a new copy_data option value,
'force', under which the user can still perform the operation. This
could be useful when the user intentionally wants to copy the initial
data even if it contains data from multiple nodes (for example, when,
in a multi-node setup, one decides to get the initial data from just
one node and then allow replication of data to proceed from each of
the respective nodes).
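
The proposed behaviour can be summarized with a small decision
function. Again, this is a Python sketch of the rule as described in
this thread, not the patch itself: the function name, the exception
class, and the "on"/"off" spellings of copy_data are assumptions made
for illustration. As noted in the quoted text above, the check only
matters when origin is specified as NONE.

```python
class SubscriptionError(Exception):
    """Hypothetical error for a rejected subscription operation."""


def initial_sync_allowed(copy_data: str, origin: str,
                         publisher_subscribes_elsewhere: bool) -> bool:
    """Return True if the initial sync runs, False if it is skipped.

    Raises SubscriptionError when the operation would copy data that
    may have come from another node and the user did not ask for it
    explicitly with copy_data = force.
    """
    if copy_data == "off":
        return False  # nothing is copied, so nothing to check
    risky = origin == "none" and publisher_subscribes_elsewhere
    if risky and copy_data != "force":
        raise SubscriptionError(
            "publisher has subscribed to the same table from another "
            "publisher; use copy_data = force to copy anyway")
    return True


# Existing setups (no origin = none) are unaffected.
print(initial_sync_allowed("on", "any", True))
# The user explicitly accepts copying data from multiple nodes.
print(initial_sync_allowed("force", "none", True))
```

With copy_data = on, origin = none, and a publisher that itself
subscribes to the same table, the function raises instead of silently
copying possibly remote data.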

The other alternative discussed was to just give a warning for such
subscription operations and probably document the steps for users to
avoid the problem. But the problem with that is that once users see
this warning, they won't be able to do anything except recreate the
setup, so why not give an error in the first place?

Thoughts?

[1] - https://www.postgresql.org/message-id/CALDaNm1eJr6qXT9esVPzgc5Qvy4uMhV4kCCTSmxARKjf%2BMwcnw%40mail.gmail.com

--
With Regards,
Amit Kapila.
