Re: Handle infinite recursion in logical replication setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-07-26 03:37:04
Message-ID: CAA4eK1LLkd3wHsw2J820ZDMsN5yw+avO25Cj_zZMqUKkwKigMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 26, 2022 at 7:13 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
>
> On 7/25/22 4:54 AM, vignesh C wrote:
> >
> > Let's take a simple case to understand why copy_data = force is
> > required to replicate between two primaries for table t1 which has
> > data as given below:
>
> > step4 - Node-2:
> > Create Subscription sub2 Connection '<node-1 details>' Publication
> > pub1_2 with (origin = none, copy_data=on);
> > If we had allowed the create subscription to be successful with
> > copy_data = on. After this the data will be something like this:
> > Node-1:
> > 1, 2, 3, 4, 5, 6, 7, 8
> >
> > Node-2:
> > 1, 2, 3, 4, 5, 6, 7, 8, 5, 6, 7, 8
> >
> > So, you can see that data on Node-2 (5, 6, 7, 8) is duplicated. In
> > case, table t1 has a unique key, it will lead to a unique key
> > violation and replication won't proceed.
> >
> > To avoid this we will throw an error:
> > ERROR: could not replicate table "public.t1"
> > DETAIL: CREATE/ALTER SUBSCRIPTION with origin = none and copy_data =
> > on is not allowed when the publisher has subscribed same table.
> > HINT: Use CREATE/ALTER SUBSCRIPTION with copy_data = off/force.
>
> Thanks for the example. I agree that it is fairly simple to reproduce.
>
> I understand that "copy_data = force" is meant to protect a user from
> hurting themself. I'm not convinced that this is the best way to do so.
>
> For example today I can subscribe to multiple publications that write to
> the same table. If I have a primary key on that table, and two of the
> subscriptions try to write an identical ID, we conflict. We don't have
> any special flags or modes to guard against that from happening, though
> we do have documentation on conflicts and managing them.
>
> AFAICT the same issue with "copy_data" also exists in the above scenario
> too, even without the "origin" attribute.
>

That's true but there is no parameter like origin = NONE which
indicates that constraint violations or duplicate data problems won't
occur due to replication. In the current case, I think the situation
is different because a user has specifically asked not to replicate
any remote data by specifying origin = NONE, which should be dealt
differently. Note that current users or their setup won't see any
difference/change unless they specify the new parameter origin as
NONE.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-07-26 03:42:23 Re: Handle infinite recursion in logical replication setup
Previous Message Kyotaro Horiguchi 2022-07-26 02:49:05 Re: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?