Re: Handle infinite recursion in logical replication setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-07-06 11:39:30
Message-ID: CAA4eK1KDh8Fv5vgWA5NPMVHVb0ptWmRrT5O8dz8qHUW+D6ALng@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 5, 2022 at 9:33 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> Since the existing test is already handling the verification of this
> scenario, I felt no need to add the test. Updated v29 patch removes
> the 0001 patch which had the test case.
>

I have again looked at the first patch and it looks good to me. I
would like to push it after some more review, but before that I would
like to know whether anyone else has suggestions or objections, so let
me summarize the idea of the first patch. It allows users to skip the
replication of remote data. Here, remote data means data that the
publisher node has received from some other node. The primary use case
is to avoid loops (infinite replication of the same data) among nodes,
as shown in the initial email of this thread.

To achieve that, this patch adds a new SUBSCRIPTION parameter "origin",
which controls which changes the subscription requests from the
publisher. Setting it to "local" means that the subscription will
request the publisher to send only changes that originated locally.
Setting it to "any" means that the publisher sends changes regardless
of their origin. The default is "any".

Usage:
CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port=9999'
PUBLICATION pub1 WITH (origin = local);
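
For the loop case from the start of this thread, a two-node setup
might look like the sketch below. The node names, ports, table t1, and
publication/subscription names are hypothetical, and "origin = local"
follows the syntax proposed in this patch. copy_data = false is used
only to sidestep the initial sync in this illustration.

```sql
-- Assumption: table t1 already exists with the same definition on both nodes.

-- On node1 (assumed to listen on port 5432)
CREATE PUBLICATION pub_node1 FOR TABLE t1;

-- On node2 (assumed to listen on port 5433)
CREATE PUBLICATION pub_node2 FOR TABLE t1;

-- On node2: receive only changes that originated on node1 itself
CREATE SUBSCRIPTION sub_node2_from_node1
    CONNECTION 'dbname=postgres port=5432'
    PUBLICATION pub_node1
    WITH (origin = local, copy_data = false);

-- On node1: receive only changes that originated on node2 itself
CREATE SUBSCRIPTION sub_node1_from_node2
    CONNECTION 'dbname=postgres port=5433'
    PUBLICATION pub_node2
    WITH (origin = local, copy_data = false);
```

With both subscriptions using origin = local, a row inserted on node1
is sent to node2, but node2 should not send it back, because on node2
that change did not originate locally.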

For now, even though the "origin" parameter allows only "local" and
"any" values, it is implemented as a string type so that the parameter
can be extended in future versions to support filtering using origin
names specified by the user.

This feature filters only replication data that originates from WAL.
For the initial sync (initial copy of table data) we don't have such a
facility, because data can be distinguished by origin only in WAL. As
a separate patch (v29-0002*), we are planning to forbid the initial
sync if we notice that the publication tables were also replicated
from other publishers, to avoid duplicate data or loops. In such
cases, the copy will still be allowed with the 'force' option.
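
Until such a check exists, a user could look for this situation by
hand on the publisher. The query below is only a rough approximation
and not the check implemented in v29-0002; it assumes a publication
named pub1 and lists its tables that are also the target of some
subscription on the same node.

```sql
-- Run on the publisher node.
-- Lists tables in publication pub1 that this node itself subscribes to,
-- i.e. tables that may contain data replicated from another publisher.
SELECT pt.schemaname, pt.tablename
FROM pg_publication_tables AS pt
JOIN pg_subscription_rel AS sr
  ON sr.srrelid = format('%I.%I', pt.schemaname, pt.tablename)::regclass
WHERE pt.pubname = 'pub1';
```

An empty result suggests that none of the published tables receive
data from another publisher through a subscription on this node.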

--
With Regards,
Amit Kapila.
