Re: Handle infinite recursion in logical replication setup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-05-20 10:00:54
Message-ID: CALDaNm1YV6gddD=pWp_qddh23xAueO0JigZ0gCeTYw+CarRE4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 18, 2022 at 4:22 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, May 18, 2022 at 10:29 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, May 4, 2022 at 12:17 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > Thanks for the comments, the attached v13 patch has the changes for the same.
> > >
> >
> > Few comments on v13-0001
> > ======================
> >
>
> Few comments on v13-0002
> ===========================
> 1.
> The steps
> + to create a two-node bidirectional replication are given below:
> + </para>
>
> The steps given after this will be successful only when there is no
> data in any of the nodes and that is not clear by reading docs.

Modified

> 2.
> + <sect2 id="add-new-node-data-in-existing-node">
> + <title>Adding a new node when data is present in the existing nodes</title>
> + <para>
> + Adding a new node <literal>node3</literal> to the existing
> + <literal>node1</literal> and <literal>node2</literal> when data is present
> + in existing nodes <literal>node1</literal> and <literal>node2</literal>
> + needs similar steps. The only change required here is that
> + <literal>node3</literal> should create a subscription with
> + <literal>copy_data = force</literal> to one of the existing nodes to
> + receive the existing data during initial data synchronization.
> + </para>
>
> I think the steps for these require the user to lock the required
> tables/database (or in some other way hold the operations on required
> tables) for node-2 till the time setup is complete, otherwise, node-3
> might miss some data. It seems the same is also missing in the
> section: "Adding a new node when data is present in the new node".

Modified. Mentioned that lock is required on new node-3 and node-2. I
have mentioned node-3 also should be locked so that no operation is
happened in node-3, else there can be some dml operation after
truncate which is sent to node-1 and the same data is synced when
create subscription in node-3 subscribing to the publisher in node-1.

> 3.
> +
> + <sect2 id="add-node-data-present-in-new-node">
> + <title>Adding a new node when data is present in the new node</title>
> ...
> ...
>
> + <para>
> + Create a subscription in <literal>node1</literal> to subscribe to
> + <literal>node3</literal>. Use <literal>copy_data</literal> specified as
> + <literal>force</literal> so that the existing table data is copied during
> + initial sync:
> +<programlisting>
> +node1=# CREATE SUBSCRIPTION sub_node1_node3
> +node1-# CONNECTION 'dbname=foo host=node3 user=repuser'
> +node1-# PUBLICATION pub_node3
> +node1-# WITH (copy_data = force, local_only = on);
> +CREATE SUBSCRIPTION
> +</programlisting></para>
> +
> + <para>
> + Create a subscription in <literal>node2</literal> to subscribe to
> + <literal>node3</literal>. Use <literal>copy_data</literal> specified as
> + <literal>force</literal> so that the existing table data is copied during
> + initial sync:
> +<programlisting>
> +node2=# CREATE SUBSCRIPTION sub_node2_node3
> +node2-# CONNECTION 'dbname=foo host=node3 user=repuser'
> +node2-# PUBLICATION pub_node3
> +node2-# WITH (copy_data = force, local_only = on);
> +CREATE SUBSCRIPTION
> +</programlisting></para>
>
> Why do we need to use "copy_data = force" here? AFAIU, unless, we
> create any subscription on node-3, we don't need the 'force' option.

Modified to on.

> 4. We should have a generic section to explain how users can add a new
> node using the new options to the existing set of nodes in all cases.
> For example, say when the existing set of nodes has some data and the
> new node also has some pre-existing data. I think the basic steps are
> something like: a. create a required publication(s) on the new node.
> (b) create subscriptions on existing nodes pointing to publication on
> the new node with the local_only option as true and copy_data = on.
> (c) wait for data to be copied from the new node to existing nodes.
> (d) Truncate the data on the new node. (e) create subscriptions
> corresponding to each of the existing publisher nodes on the new node
> with local_only as true and copy_data = force for one of the nodes.
> (f) One needs to ensure that there is no new activity on required
> tables/database (either by locking or in some other way) in all the
> nodes for which copy_data option is kept as false while creating
> subscriptions in the previous step to avoid any data loss.

Added

> We also need to mention in Notes that as all operations are not
> transactional, user is advised to take backup of existing data to
> avoid any inconsistency.

Modified

> 5.
> * It is quite possible that subscriber has not yet pulled data to
> + * the tables, but in ideal cases the table data will be subscribed.
> + * To keep the code simple it is not checked if the subscriber table
> + * has pulled the data or not.
> + */
> + if (copydata == COPY_DATA_ON && local_only && !slot_attisnull(slot, 3))
>
> Sorry, but I don't understand what you intend to say by the above
> comment. Can you please explain?

When the user specifies copy_data as on, we should check if the
publisher has the publication tables being subscribed from a remote
publisher. If so throw an error as it remote origin data present.
Ex:
Node1 - pub1 for table t1 -- no data
Node2 - Sub1 subscribing to data from pub1
Node2 - pub2 for table t1 -- no data
Node3 - create subscription to Node2 with copy_data = ON

In this case even though the table does not have any remote origin
data, as Node2 is subscribing to data from Node1, throw an error.
We throw an error irrespective of data present in Node1 or not to keep
the code simple.

> 6. I feel we can explain the case with only two nodes in the commit
> message, why do we need to use a three-node case?

Modified

Thanks for the comments, the v14 patch attached at [1] has the changes
for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm0xuYy35vOudVHBjov3fQ%3DjBRHJHKUUN9VarqO%3DYqtaxg%40mail.gmail.com

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2022-05-20 10:01:31 Re: PG15 beta1 sort performance regression due to Generation context change
Previous Message vignesh C 2022-05-20 09:45:42 Re: Handle infinite recursion in logical replication setup