Re: Handle infinite recursion in logical replication setup

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-03-07 09:55:18
Message-ID: CAFiTN-tSNRp8i8Zou3V3nh0c3QD9FByKSTaC7Y5cgSnX9X3rSA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 7, 2022 at 3:01 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Mar 7, 2022 at 1:11 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, Mar 7, 2022 at 10:15 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > I haven't yet gone through the patch, but I have a question about the
> > > > > idea. Suppose I want to set up a logical replication like,
> > > > > node1->node2->node3->node1. So how would I create the subscriber at
> > > > > > node1? only_local=on or off? I mean, on node1 I want the changes
> > > > > from node3 which are generated on node3 or which are replicated from
> > > > > node2 but I do not want changes that are replicated from node1 itself?
> > > > > So if I set only_local=on then node1 will not get the changes
> > > > > > replicated from node2, is that right? And if I set only_local=off then
> > > > > it will create the infinite loop again? So how are we protecting
> > > > > against this case?
> > > > >
> > > >
> > > > In the above topology if you want local changes from both node3 and
> > > > node2 then I think the way to get that would be you have to create two
> > > > subscriptions on node1. The first one points to node2 (with
> > > > only_local=off) and the second one points to node3 (with only_local
> > > > =off).
> > > >
> > >
> > > Sorry, I intended to say 'only_local=on' in both places in my previous email.
> >
> > Hmm okay, so for this topology we will have to connect node1 directly
> > to node2 as well as to node3, but we cannot cascade the changes. I was
> > wondering whether it can be done without the extra connection between
> > node2 and node1? I mean, instead of making this a boolean flag that
> > says whether we want local or remote changes, can't we control the
> > changes based on the origin id? Then node1 would get the local
> > changes of node3 and, using the same subscription, also the changes
> > from node3 that originated on node2, but it would not receive the
> > changes that originated on node1.
> >
>
> Good point. I think we can provide that as an additional option to
> give more flexibility but we won't be able to use it for initial sync
> where we can't differentiate between data from different origins.
> Also, I think as origins are internally generated, we may need some
> better way to expose it to users so that they can specify it as an
> option. Isn't it better to first provide some simple way, like a
> boolean option, so that users have some way to replicate the same table
> data among different nodes without causing an infinite loop, and then
> extend it as you are suggesting, or maybe in some other way as well?

Yeah, it makes sense to first provide a simple mechanism to enable
this; we can extend it later.
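
To make the difference between the boolean flag and the origin-based
filter concrete, here is a toy simulation of the node1->node2->node3->node1
ring. It is only a sketch and not PostgreSQL code: the function names
(replicate, accept_all, only_local, skip_own) and the data layout are made
up for illustration, each change simply carries the name of the node it
originated on, and the initial table sync (where origins can't be
distinguished) is not modeled at all.

```python
from collections import deque

# Hypothetical filters, one per subscription. Each decides whether a change
# with the given origin, currently present on the publisher, is sent to the
# subscriber.
def accept_all(origin, pub, sub):
    return True               # no filtering: changes circulate forever

def only_local(origin, pub, sub):
    return origin == pub      # boolean flag: only changes made on the publisher

def skip_own(origin, pub, sub):
    return origin != sub      # origin filter: drop changes that started on the subscriber

def replicate(subscriptions, writes, max_deliveries=50):
    """Propagate writes through the subscriptions.

    subscriptions: list of (publisher, subscriber, filter_function)
    writes: list of (node, row) local writes
    Returns (applied, looped). applied maps node -> list of (origin, row);
    looped is True if the delivery cap was hit, i.e. the setup recursed.
    """
    applied = {}
    queue = deque()
    for node, row in writes:
        applied.setdefault(node, []).append((node, row))
        queue.append((node, node, row))  # (node holding the change, origin, row)
    deliveries = 0
    while queue:
        at, origin, row = queue.popleft()
        for pub, sub, accept in subscriptions:
            if pub != at or not accept(origin, pub, sub):
                continue
            if deliveries >= max_deliveries:
                return applied, True
            deliveries += 1
            applied.setdefault(sub, []).append((origin, row))
            queue.append((sub, origin, row))
    return applied, False

def ring(accept):
    # node1 -> node2 -> node3 -> node1, same filter on every subscription
    return [("node1", "node2", accept),
            ("node2", "node3", accept),
            ("node3", "node1", accept)]

WRITES = [("node1", "a"), ("node2", "b"), ("node3", "c")]

if __name__ == "__main__":
    for accept in (accept_all, only_local, skip_own):
        applied, looped = replicate(ring(accept), WRITES)
        rows = sorted(row for _, row in applied["node1"])
        print(accept.__name__, "looped:", looped, "rows on node1:", rows)
```

In this model the unfiltered ring hits the delivery cap, only_local
terminates but node1 never sees the row written on node2, and skip_own
terminates with every node holding all three rows exactly once. That is
the behavior I was asking about, at least in this simplified form.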

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
