Re: Handle infinite recursion in logical replication setup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-06-14 12:04:52
Message-ID: CALDaNm0XtQVX3taeLKWE-gPQyppqs34ipXawAPOyO=he37MQSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 10, 2022 at 10:23 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Below are some review comments for the patch v18-0003
>
> 1. Commit message
>
> This patch does a couple of things:
> change 1) Checks and throws an error if the publication tables were also
> subscribing data in the publisher from other publishers when copy_data
> parameter is specified as 'ON' and origin parameter is specified as
> 'local'.
> change 2) Adds force value for copy_data parameter.
>
> SUGGESTION
> This patch does a couple of things:
> change 1) Checks and throws an error if 'copy_data = on' and 'origin =
> local' but the publication tables were also subscribing data in the
> publisher from other publishers.
> change 2) Adds 'force' value for copy_data parameter.

Modified

> ~~~
>
> 2. Commit message - about the example
>
> All my following review comments for the commit message are assuming
> that the example steps are as they are written in the patch, but
> actually I felt that the example might be more complex than it needed
> to be: e.g
> - You don’t really need the node2 to have data
> - Therefore you don’t need all the added TRUNCATE complications
>
> E.g. I think you only need node1 (with data) and node2 (no data).
>
> Then node1 subscribes node2 with (origin=local, copy_data=off).
> Then node2 subscribes node1 with (origin=local, copy_data=on).
> - Demonstrates exception happens because node1 already had a subscription
> - Demonstrates need for the copy_data=force to override that exception
>
> So please consider using a simpler example for this commit message

Modified

> ~~~
>
> 3. Commit message
>
> The below help us understand how the first change will be useful:
>
> If copy_data parameter was used with 'on' in step 5, then an error
> will be thrown
> to alert the user to prevent inconsistent data being populated:
> CREATE SUBSCRIPTION sub_node2_node1 CONNECTION '<node1 details>'
> PUBLICATION pub_node1 WITH (copy_data = on, origin = local);
> ERROR: CREATE/ALTER SUBSCRIPTION with origin and copy_data as true is not
> allowed when the publisher might have replicated data
>
> SUGGESTION
> The steps below help to demonstrate how the new exception is useful:
>
> The initial copy phase has no way to know the origin of the row data,
> so if 'copy_data = on' in the step 5 below, then an error will be
> thrown to prevent any potentially non-local data from being copied:
> <blank line>
> e.g
> CREATE SUBSCRIPTION ...

Modified

> ~~~
>
> 4. Commit message
>
> The following will help us understand how the second change will be useful:
> Let's take a simple case where user is trying to setup bidirectional logical
> replication between node1 and node2 where the two nodes have some pre-existing
> data like below:
> node1: Table t1 (c1 int) has data 11, 12, 13, 14
> node2: Table t1 (c1 int) has data 21, 22, 23, 24
>
> SUGGESTION
> The following steps help to demonstrate how the 'copy_data = force'
> change will be useful:
> <blank line>
> Let's take a scenario where the user wants to set up bidirectional
> logical replication between node1 and node2 where the same table on
> each node has pre-existing data. e.g.
> node1: Table t1 (c1 int) has data 11, 12, 13, 14
> node2: Table t1 (c1 int) has data 21, 22, 23, 24

Modified

> ~~~
>
> 5. Commit message
>
> step 4:
> node2=# CREATE SUBSCRIPTION sub_node2_node1 Connection '<node1 details>'
> node2-# PUBLICATION pub_node1;
> CREATE SUBSCRIPTION
>
> "Connection" => "CONNECTION"

Modified

> ~~~
>
> 6. Commit message
>
> If table t1 has a unique key, it will cause a unique key
> violation and replication won't proceed.
>
> SUGGESTION
> In case, table t1 has a unique key, it will lead to a unique key
> violation and replication won't proceed.

Modified

> ~~~
>
> 7. Commit message
>
> step 3: Create a subscription in node1 to subscribe to node2. Use
> copy_data specified as on so that the existing table data is copied during
> initial sync:
>
> SUGGESTION
> step 3: Create a subscription in node1 to subscribe to node2. Use
> 'copy_data = on' so that the existing table data is copied during
> initial sync:

Modified

> ~~~
>
> 8. Commit message
>
> step 4: Adjust the publication publish settings so that truncate is not
> published to the subscribers and truncate the table data in node2:
>
> SUGGESTION (only added a comma)
> step 4: Adjust the publication publish settings so that truncate is
> not published to the subscribers, and truncate the table data in
> node2:

This content is not required any more, I have removed it.

> ~~~
>
> 9. Commit message
>
> step 5: Create a subscription in node2 to subscribe to node1. Use copy_data
> specified as force when creating a subscription to node1 so that the existing
> table data is copied during initial sync:
>
> SUGGESTION
> step 5: Create a subscription in node2 to subscribe to node1. Use
> 'copy_data = force' when creating a subscription to node1 so that the
> existing table data is copied during initial sync:

Modified

> ======
>
> 10. doc/src/sgml/ref/create_subscription.sgml
>
> @@ -383,6 +398,15 @@ CREATE SUBSCRIPTION <replaceable
> class="parameter">subscription_name</replaceabl
> can have non-existent publications.
> </para>
>
> + <para>
> + If subscription is created with <literal>origin = local</literal> and
> + <literal>copy_data = on</literal>, it will check if the publisher tables are
> + being subscribed to any other publisher and throw an error to prevent
> + inconsistent data in the subscription. The user can continue with the copy
> + operation without throwing any error in this case by specifying
> + <literal>copy_data = force</literal>.
> + </para>
>
> SUGGESTION (minor rewording)
> If the subscription is created with <literal>origin = local</literal>
> and <literal>copy_data = on</literal>, it will check if the publisher
> tables are being subscribed to any other publisher and, if so, then
> throw an error to prevent possible non-local data from being copied.
> The user can override this check and continue with the copy operation
> by specifying <literal>copy_data = force</literal>.

Modified

> ======
>
> 11. src/backend/commands/subscriptioncmds.c - parse_subscription_options
>
> From [1]:
> >> What about also allowing copy_data = 2, and making it equivalent to "force"?
> > Vignesh: I felt the existing looks ok, no need to support 2. It might confuse the user.
>
> I don't think it would be confusing, but I also don't feel strongly
> enough to debate it. Anyway, I could not find a similar precedent, so
> your decision is fine.

Ok

> ~~~
>
> 12. src/backend/commands/subscriptioncmds.c - parse_subscription_options
>
> @@ -339,17 +406,16 @@ parse_subscription_options(ParseState *pstate,
> List *stmt_options,
> errmsg("%s and %s are mutually exclusive options",
> "connect = false", "create_slot = true")));
>
> - if (opts->copy_data &&
> - IsSet(opts->specified_opts, SUBOPT_COPY_DATA))
> + if (opts->copy_data && IsSet(opts->specified_opts, SUBOPT_COPY_DATA))
> ereport(ERROR,
>
> This is just a formatting change. Is it needed for this patch? patch.

Modified

> ~~~
>
> 13. src/backend/commands/subscriptioncmds.c - AlterSubscription_refresh
>
> @@ -730,7 +798,8 @@ CreateSubscription(ParseState *pstate,
> CreateSubscriptionStmt *stmt,
> * PENDING, to allow ALTER SUBSCRIPTION ... REFRESH
> * PUBLICATION to work.
> */
> - if (opts.twophase && !opts.copy_data && tables != NIL)
> + if (opts.twophase && opts.copy_data == COPY_DATA_OFF &&
> + tables != NIL)
> twophase_enabled = true;
>
> Why is this change needed? I thought the original code is OK now since
> COPY_DATA_OFF = 0

Modified

> ~~~
>
> 14. src/backend/commands/subscriptioncmds.c - AlterSubscription
>
> @@ -1265,7 +1337,8 @@ AlterSubscription(ParseState *pstate,
> AlterSubscriptionStmt *stmt,
> *
> * For more details see comments atop worker.c.
> */
> - if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED && opts.copy_data)
> + if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
> + opts.copy_data)
>
> This is just a formatting change. Is it needed for this patch?

Modified

> ~~~
>
> 15. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed
>
> +
> + if (!origin || (strcmp(origin, "local") != 0) || copydata != COPY_DATA_ON)
> + return;
> +
>
> This condition could be rearranged to put the strcmp last so it is not
> called unless absolutely necessary.

Modified

> ~~~
>
> 16. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed
>
> + appendStringInfoString(&cmd,
> + "SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename,
> PS.srrelid as replicated\n"
> + "FROM pg_publication P,\n"
>
> The line is too long; needs wrapping.

Modified

> ~~~
>
> 17. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed
>
> + if (!slot_attisnull(slot, 3))
> + ereport(ERROR,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("table:%s.%s might have replicated data in the publisher",
> + nspname, relname),
> + errdetail("CREATE/ALTER SUBSCRIPTION with origin as local and
> copy_data as true is not allowed when the publisher might have
> replicated data."),
> + errhint("Use CREATE/ALTER SUBSCRIPTION with copy_data = off or force."));
>
> I felt the errmsg may be easier to read using "=" instead of "as".
> Anyway, it would be more consistent with the errhint. Also, change the
> "true" to "on" to be consistent with the errhint.
>
> SUGGESTION
> errdetail("CREATE/ALTER SUBSCRIPTION with origin = local and copy_data
> = on is not allowed when the publisher might have replicated data."),

Modified

Thanks for the comments, the attached v20 patch has the changes for the same.

Regards,
Vignesh

Attachment Content-Type Size
v20-0001-Add-a-missing-test-to-verify-only-local-paramete.patch text/x-patch 4.3 KB
v20-0004-Document-bidirectional-logical-replication-steps.patch text/x-patch 13.5 KB
v20-0003-Check-and-throw-an-error-if-publisher-tables-wer.patch text/x-patch 36.5 KB
v20-0002-Skip-replication-of-non-local-data.patch text/x-patch 56.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-06-14 12:08:50 Re: Handle infinite recursion in logical replication setup
Previous Message r.takahashi_2@fujitsu.com 2022-06-14 10:40:51 RE: Multi-Master Logical Replication