Re: Handle infinite recursion in logical replication setup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-06-14 12:08:50
Message-ID: CALDaNm2EqCwLuHbS_k7i5ORr4h0K=BVJ6OATgn6JxUrg=9rfLw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 10, 2022 at 2:09 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Below are some review comments for the patch v18-0004
>
> ======
>
> 1. Commit message
>
> Document the steps for the following:
> a) Create a two-node bidirectional replication when there is no data in both the
> nodes.
> b) Adding a new node when there is no data in any of the nodes.
> c) Adding a new node when data is present in the existing nodes.
> d) Generic steps to add a new node to the existing set of nodes.
>
> SUGGESTION (minor changes to make the tense consistent)
> Documents the steps for the following:
> a) Creating a two-node bidirectional replication when there is no data
> in both nodes.
> b) Adding a new node when there is no data in any of the nodes.
> c) Adding a new node when data is present in the existing nodes.
> d) Generic steps for adding a new node to an existing set of nodes.

Modified

> ======
>
> 2. doc/src/sgml/logical-replication.sgml - blurb
>
> + <para>
> + Bidirectional replication is useful in creating a multi-master database
> + which helps in performing read/write operations from any of the nodes.
>
> SUGGESTION
> Bidirectional replication is useful for creating a multi-master
> database environment for replicating read/write operations performed
> by any of the member nodes.

Modified

> ~~~
>
> 3. doc/src/sgml/logical-replication.sgml - warning
>
> + <warning>
> + <para>
> + Setting up bidirectional logical replication across nodes
> requires multiple
> + steps to be performed on various nodes. Because not all operations are
> + transactional, the user is advised to take backups.
> + </para>
> + </warning>
>
> SUGGESTION (but keep your formatting)
> Setting up bidirectional logical replication requires multiple steps
> to be performed on various nodes. Because...

Modified

> ~~~
>
> 4. doc/src/sgml/logical-replication.sgml - Setting bidirectional
> replication between two nodes
>
> + <para>
> + The steps to create a two-node bidirectional replication when there is no
> + data in both the nodes <literal>node1</literal> and
> + <literal>node2</literal> are given below:
> + </para>
>
> SUGGESTION (but keep your formatting)
> The following steps demonstrate how to create a two-node bidirectional
> replication when there is no table data present in both nodes node1
> and node2:

Modified

> ~~~
>
> 5. doc/src/sgml/logical-replication.sgml - Setting bidirectional
> replication between two nodes
>
> + <para>
> + Lock the required tables of <literal>node1</literal> and
> + <literal>node2</literal> in <literal>EXCLUSIVE</literal> mode until the
> + setup is completed.
> + </para>
>
> Instead of "the required tables" shouldn't this just say "table t1"?

Modified

> ~~~
>
> 6. doc/src/sgml/logical-replication.sgml - Setting bidirectional
> replication between two nodes
>
> + <para>
> + Now the bidirectional logical replication setup is complete between
> + <literal>node1</literal>, <literal>node2</literal> and
> + <literal>node2</literal>. Any subsequent changes in one node will
> + replicate the changes to the other nodes.
> + </para>
>
> SUGGESTION (for 2nd sentence, and keep your formatting)
> Any incremental changes from node1 will be replicated to node2, and
> any incremental changes from node2 will be replicated to node1.

Modified

> ~~~
>
> 7. doc/src/sgml/logical-replication.sgml - Adding a new node when
> there is no data in any of the nodes
>
> + <para>
> + Adding a new node <literal>node3</literal> to the existing
> + <literal>node1</literal> and <literal>node2</literal> when there is no data
> + in any of the nodes requires setting up subscription in
> + <literal>node1</literal> and <literal>node2</literal> to replicate the data
> + from <literal>node3</literal> and setting up subscription in
> + <literal>node3</literal> to replicate data from <literal>node1</literal>
> + and <literal>node2</literal>.
> + </para>
>
> SUGGESTION (but keep your formatting)
> The following steps demonstrate adding a new node node3 to the
> existing node1 and node2 when there is no t1 data in any of the nodes.
> This requires creating subscriptions in node1 and node2 to replicate
> the data from node3 and creating subscriptions in node3 to replicate
> data from node1 and node2.

Modified

> ~~~
>
> 8. doc/src/sgml/logical-replication.sgml - Adding a new node when
> there is no data in any of the nodes
>
> + <note>
> + <para>
> + It is assumed that the bidirectional logical replication between
> + <literal>node1</literal> and <literal>node2</literal> is already
> completed.
> + </para>
> + </note>
>
> IMO this note should just be a text note in the previous paragraph
> instead of an SGML <note>. e.g. "Note: These steps assume that..."

Modified

> ~~~
>
> 9. doc/src/sgml/logical-replication.sgml - Adding a new node when
> there is no data in any of the nodes
>
> + <para>
> + Lock the required tables of all the nodes <literal>node1</literal>,
> + <literal>node2</literal> and <literal>node3</literal> in
> + <literal>EXCLUSIVE</literal> mode until the setup is completed.
> + </para>
>
> Instead of "the required tables" shouldn't this just say "table t1"?

Modified

> ~~~
>
> 10. doc/src/sgml/logical-replication.sgml - Adding a new node when
> there is no data in any of the nodes
>
> + <para>
> + Now the bidirectional logical replication setup is complete between
> + <literal>node1</literal>, <literal>node2</literal> and
> + <literal>node2</literal>. Any subsequent changes in one node will
> + replicate the changes to the other nodes.
> + </para>
>
> SUGGESTION (2nd sentence)
> Incremental changes made in any node will be replicated to the other two nodes.

Modified

> ~~~
>
> 11. doc/src/sgml/logical-replication.sgml - Adding a new node when
> data is present in the existing nodes
>
> + <para>
> + Adding a new node <literal>node3</literal> which has no data to the
> + existing <literal>node1</literal> and <literal>node2</literal> when data
> + is present in existing nodes <literal>node1</literal> and
> + <literal>node2</literal> needs similar steps. The only change required
> + here is that <literal>node3</literal> should create a subscription with
> + <literal>copy_data = force</literal> to one of the existing nodes to
> + receive the existing data during initial data synchronization.
> + </para>
>
> SUGGESTION (but keep your formatting)
> The following steps demonstrate adding a new node node3 which has no
> t1 data to the existing node1 and node2 where t1 data is present. This
> needs similar steps; the only change required here is that node3
> should create a subscription with copy_data = force to one of the
> existing nodes so it can receive the existing t1 data during initial
> data synchronization.

Modified

> ~~~
>
> 12 doc/src/sgml/logical-replication.sgml - Adding a new node when data
> is present in the existing nodes
>
> + <note>
> + <para>
> + It is assumed that the bidirectional logical replication between
> + <literal>node1</literal> and <literal>node2</literal> is already
> completed.
> + The nodes <literal>node1</literal> and <literal>node2</literal> has some
> + pre-existing data in table t1 that is synchronized in both the nodes.
> + </para>
> + </note>
>
> 12a.
> IMO this note should just be text in the previous paragraph instead of
> an SGML <note>. e.g. "Note: These steps assume that..."

Modified

> 12b.
> SUGGESTION (minor rewording; keep your formatting)
> Note: These steps assume that the bidirectional logical replication
> between node1 and node2 is already completed, and the pre-existing
> data in table t1 is already synchronized in both those nodes.

Modified

> ~~~
>
> 13. doc/src/sgml/logical-replication.sgml - Adding a new node when
> data is present in the existing nodes
>
> + <para>
> + Lock the required tables of <literal>node2</literal> and
> + <literal>node3</literal> in <literal>EXCLUSIVE</literal> mode until the
> + setup is completed. No need to lock the tables in <literal>node1</literal>
> + as any data changes made will be synchronized while creating the
> + subscription with <literal>copy_data</literal> specified as
> + <literal>force</literal>.
> + </para>
>
> 13a.
> Instead of "the required tables" shouldn't this just say "table t1"?

Modified

> 13b.
> SUGGESTION (2nd sentence; keep your formatting)
> There is no need to lock table t1 in node1 because any data changes
> made will be synchronized while creating the subscription with
> copy_data = force.

Modified

> ~~~
>
> 14. doc/src/sgml/logical-replication.sgml - Adding a new node when
> data is present in the existing nodes
>
> + <para>
> + Create a subscription in <literal>node3</literal> to subscribe to
> + <literal>node1</literal>. Use <literal>copy_data</literal> specified as
> + <literal>force</literal> so that the existing table data is
> + copied during initial sync:
>
> SUGGESTION (2nd sentence; keep your formatting)
> Use copy_data = force so that the existing table data is copied during
> the initial sync:

Modified

> ~~~
>
> 15. doc/src/sgml/logical-replication.sgml - Adding a new node when
> data is present in the existing nodes
>
> + <para>
> + Create a subscription in <literal>node3</literal> to subscribe to
> + <literal>node2</literal>. Use <literal>copy_data</literal> specified as
> + <literal>off</literal> because the initial table data would have been
> + already copied in the previous step:
>
> SUGGESTION (2nd sentence; keep your formatting)
> Use copy_data = off because the initial table data would have been
> already copied in the previous step:

Modified

> ~~~
>
> 16. doc/src/sgml/logical-replication.sgml - Adding a new node when
> data is present in the existing nodes
>
> + <para>
> + Now the bidirectional logical replication setup is complete between
> + <literal>node1</literal>, <literal>node2</literal> and
> + <literal>node2</literal>. Any subsequent changes in one node will
> + replicate the changes to the other nodes.
> + </para>
>
> 16a.
> Should say node1, node2 and node3

Modified it in v19

> 16b.
> SUGGESTION (2nd sentence – same as the previous comment)
> Incremental changes made in any node will be replicated to the other two nodes.

Modified

> ~~~
>
> 17. doc/src/sgml/logical-replication.sgml - Adding a new node when
> data is present in the new node
>
> + <warning>
> + <para>
> + Adding a new node <literal>node3</literal> to the existing
> + <literal>node1</literal> and <literal>node2</literal> when data is present
> + in the new node <literal>node3</literal> is not possible.
> + </para>
> + </warning>
>
> 17a.
> IMO
> - Not really necessary to name the nodes, because this is a generic statement
> - Maybe say "not supported" instead of "not possible". e.g. there is
> no ERROR check for this case is there?
>
> SUGGESTION
> Adding a new node when data is present in the new node tables is not supported.

Modified

> 17b.
> I am not sure but I felt this advice seemed more like an SGML <note>;
> note a <warning>

Modified

> ~~~
>
> 18. doc/src/sgml/logical-replication.sgml - Generic steps to add a new
> node to the existing set of nodes
>
> + <title>Generic steps to add a new node to the existing set of nodes</title>
>
> SUGGESTION
> Generic steps for adding a new node to an existing set of nodes

Modified

> ~~~
>
> 19. doc/src/sgml/logical-replication.sgml - Generic steps to add a new
> node to the existing set of nodes
>
> + <para>
> + 1. Create the required publication on the new node.
> + </para>
> + <para>
> + 2. Lock the required tables of the new node in <literal>EXCLUSIVE</literal>
> + mode until the setup is complete. This is required to prevent any
> + modifications from happening in the new node. If data modifications occur
> + after step-3, there is a chance that the modifications will be published to
> + the first node and then synchronized back to the new node while creating
> + subscription in step-5 resulting in inconsistent data.
> + </para>
> + <para>
> + 3. Create subscriptions on existing nodes pointing to publication on
> + the new node with <literal>origin</literal> parameter specified as
> + <literal>local</literal> and <literal>copy_data</literal> specified as
> + <literal>off</literal>.
> + </para>
> + <para>
> + 4. Lock the required tables of the existing nodes except the first node in
> + <literal>EXCLUSIVE</literal> mode until the setup is complete. This is
> + required to prevent any modifications from happening. If data modifications
> + occur, there is a chance that the modifications done in between step-5 and
> + step-6 will not be synchronized to the new node resulting in inconsistent
> + data. No need to lock the tables in the first node as any data changes
> + made will be synchronized while creating the subscription with
> + <literal>copy_data</literal> specified as <literal>force</literal>.
> + </para>
> + <para>
> + 5. Create a subscription on the new node pointing to publication on the
> + first node with <literal>origin</literal> parameter specified as
> + <literal>local</literal> and <literal>copy_data</literal> parameter
> + specified as <literal>force</literal>.
> + </para>
> + <para>
> + 6. Create subscriptions on the new node pointing to publications on the
> + remaining nodes with <literal>origin</literal> parameter specified as
> + <literal>local</literal> and <literal>copy_data</literal> parameter
> + specifiedas <literal>off</literal>.
> + </para>
>
> Following suggestions make the following changes:
> - Some minor rewording
> - Change the step names
> - Use "=" instead of "specified as" for the parameters
> - I felt it is more readable if the explanatory notes are separate
> paragraphs from the steps. Maybe if you could indent them or something
> it would be even better.
> - Added a couple more explanatory notes
>
> SUGGESTION (using those above changes; please keep your formatting)
>
> Step-1: Create a publication on the new node.
>
> Step-2: Lock the required tables of the new node in EXCLUSIVE mode
> until the setup is complete.
>
> (This lock is necessary to prevent any modifications from happening in
> the new node because if data modifications occurred after Step-3,
> there is a chance that the modifications will be published to the
> first node and then synchronized back to the new node while creating
> the subscription in Step-5. This would result in inconsistent data).
>
> Step-3. Create subscriptions on existing nodes to the publication on
> the new node with origin = local and copy_data = off.
>
> (The copy_data = off is OK here because it is asserted that the
> published tables of the new node will have no pre-existing data).
>
> Step-4. Lock the required tables of the existing nodes except the
> first node in EXCLUSIVE mode until the setup is complete.
>
> (This lock is necessary to prevent any modifications from happening.
> If data modifications occur, there is a chance that modifications done
> between Step-5 and Step-6 will not be synchronized to the new node.
> This would result in inconsistent data. There is no need to lock table
> t1 in node1 because any data changes made will be synchronized while
> creating the subscription with copy_data = force).
>
> Step-5. Create a subscription on the new node to the publication on
> the first node with origin = local and copy_data = force.
>
> (This will copy the same table data from the existing nodes to the new node)
>
> Step-6. Create subscriptions on the new node to publications on the
> remaining nodes with origin = local and copy_data = off.
>
> (The copy_data = off is OK here because the existing node data was
> already copied to the new node in Step-5)

Modified

> ======
>
> 20. doc/src/sgml/ref/create_subscription.sgml
>
> - <literal>copy_data = force</literal>.
> + <literal>copy_data = force</literal>. Refer to the
> + <xref linkend="logical-replication-bidirectional"/> on how
> + <literal>copy_data</literal> and <literal>origin</literal> can be used
> + in bidirectional replication.
> </para>
>
> SUGGESTION (keep your formatting)
> Refer to <xref linkend="logical-replication-bidirectional"/> for how
> <literal>copy_data</literal> and <literal>origin</literal> can be used
> in bidirectional replication.

Modified

Thanks for the comments. The comments are handled as part of the v20
patch attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm0XtQVX3taeLKWE-gPQyppqs34ipXawAPOyO%3Dhe37MQSg%40mail.gmail.com

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2022-06-14 12:42:55 Re: [v15 beta] pg_upgrade failed if earlier executed with -c switch
Previous Message vignesh C 2022-06-14 12:04:52 Re: Handle infinite recursion in logical replication setup