Re: Handle infinite recursion in logical replication setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-03-07 09:31:15
Message-ID: CAA4eK1+wazGaemoxN=79xOOQLOWZ0p-gzkwdb58HuGqvDK0qOA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 7, 2022 at 1:11 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Mon, Mar 7, 2022 at 10:15 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > I haven't yet gone through the patch, but I have a question about the
> > > > idea. Suppose I want to set up a logical replication like,
> > > > node1->node2->node3->node1. So how would I create the subscriber at
> > > > node1? only_local=on or off? I mean on node1, I want the changes
> > > > from node3 which are generated on node3 or which are replicated from
> > > > node2 but I do not want changes that are replicated from node1 itself.
> > > > So if I set only_local=on then node1 will not get the changes
> > > > replicated from node2, is that right? And if I set only_local=off then
> > > > it will create the infinite loop again? So how are we protecting
> > > > against this case?
> > > >
> > >
> > > In the above topology if you want local changes from both node3 and
> > > node2 then I think the way to get that would be you have to create two
> > > subscriptions on node1. The first one points to node2 (with
> > > only_local=off) and the second one points to node3 (with
> > > only_local=off).
> > >
> >
> > Sorry, I intended to say 'only_local=on' at both places in my previous email.
>
> Hmm okay, so for this topology we will have to connect node1 directly
> to node2 as well as to node3 but cannot cascade the changes. I was
> wondering whether it can be done without the extra connection between
> node2 and node1. I mean, instead of making this a boolean flag for
> whether we want local or remote changes, can't we control the
> changes based on the origin id? That way node1 would get the local
> changes of node3 and, using the same subscription, also the
> changes from node3 that originated on node2, but it would not
> receive the changes that originated on node1.
>

Good point. I think we can provide that as an additional option to
give more flexibility, but we won't be able to use it for the initial
sync, where we can't differentiate between data from different origins.
Also, as origins are internally generated, we may need some better way
to expose them to users so that they can specify one as an option.
Isn't it better to first provide a simple mechanism, like a boolean
option, so that users have some way to replicate the same table data
among different nodes without causing an infinite loop, and then
extend it as you are suggesting, or maybe in some other way as well?
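
To make the difference between the two filtering approaches concrete,
here is a toy simulation of the node1->node2->node3->node1 ring. It is
only a sketch and not PostgreSQL code: Change, should_apply, replicate
and the exclude_origins filter are hypothetical names invented for this
illustration, and it models only streamed changes, not the initial sync.

```python
# Toy model of the ring node1 -> node2 -> node3 -> node1.
# Not PostgreSQL code: every name below is hypothetical and illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    origin: str   # node where the change was first generated
    payload: str


def should_apply(change, publisher, only_local=False, exclude_origins=frozenset()):
    """Decide whether a subscriber applies a change sent by `publisher`."""
    if only_local and change.origin != publisher:
        return False  # boolean option: drop anything the publisher merely relayed
    if change.origin in exclude_origins:
        return False  # origin-based option: drop changes from the listed origins
    return True


def replicate(ring, start, change, filters, max_hops=10):
    """Pass `change` around the ring; return the nodes that applied it, in order.

    `max_hops` only exists to cut off the otherwise endless loop.
    """
    applied = []
    publisher = start
    for _ in range(max_hops):
        subscriber = ring[publisher]
        if not should_apply(change, publisher, **filters.get(subscriber, {})):
            break
        applied.append(subscriber)
        publisher = subscriber
    return applied


ring = {"node1": "node2", "node2": "node3", "node3": "node1"}
change = Change(origin="node1", payload="row 1")

# No filtering at all: the change keeps circulating until max_hops stops it.
unfiltered = replicate(ring, "node1", change, {})

# only_local on every subscription: no loop, but also no cascading.
local_only = replicate(ring, "node1", change,
                       {n: {"only_local": True} for n in ring})

# Each node filters out changes that originated on itself.
by_origin = replicate(ring, "node1", change,
                      {n: {"exclude_origins": frozenset({n})} for n in ring})

print(len(unfiltered), local_only, by_origin)
```

In this model the unfiltered run is stopped only by max_hops, the
only_local run reaches node2 alone (node3 would need its own
subscription to node1), and the origin-based run reaches node2 and
node3 before node1 rejects its own change.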

--
With Regards,
Amit Kapila.
