Re: Handle infinite recursion in logical replication setup

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-08-29 06:29:27
Message-ID: CAHut+PvLMRUOwH0nS9c6hKCVhRhYKtMTLrZqG6WPL3SGiRmhGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for patch v42-0001:

======

1. Commit message

A later review comment below suggests some changes to the WARNING
message so if those changes are made then the example in this commit
message also needs to be modified.

======

2. doc/src/sgml/ref/alter_subscription.sgml - copy_data

Refer to the Notes for the usage of true for copy_data parameter and
its interaction with the origin parameter.

I think saying "usage of true" sounded a bit strange.

SUGGESTION
Refer to the Notes about how copy_data = true can interact with the
origin parameter.

======

3. doc/src/sgml/ref/create_subscription.sgml - copy data

Refer to the Notes for the usage of true for copy_data parameter and
its interaction with the origin parameter.

SUGGESTION
(same as #2)

~~

4. doc/src/sgml/ref/create_subscription.sgml - origin

Refer to the Notes for the usage of true for copy_data parameter and
its interaction with the origin parameter.

SUGGESTION
(same as #2)

~~~

5. doc/src/sgml/ref/create_subscription.sgml - Notes

+ <para>
+ If the subscription is created with <literal>origin = none</literal> and
+ <literal>copy_data = true</literal>, it will check if the publisher has
+ subscribed to the same table from other publishers and, if so, throw a
+ warning to notify user to check the publisher tables. The user can ensure

"throw a warning to notify user" -> "log a warning to notify the user"

~~

6.

+ warning to notify user to check the publisher tables. The user can ensure
+ that publisher tables do not have data which has an origin associated before
+ continuing with any other operations to prevent inconsistent data being
+ replicated.
+ </para>

6a.

I'm not so sure about this. IMO the warning is not about the
replication – really it is about the COPY which has already happened
anyway. So the user can't really prevent anything from going wrong;
instead, they have to take some action to clean up if anything did go
wrong.

SUGGESTION
Before continuing with other operations the user should check that
publisher tables did not have data with different origins, otherwise
inconsistent data may have been copied.

~

6b.

I am also wondering what can the user do now. Assuming there was bad
COPY then the subscriber table has already got unwanted stuff copied
into it. Is there any advice we can give to help users fix this mess?

======

7. src/backend/commands/subscriptioncmds.c - AlterSubscription_refresh

+ /* Check whether we can allow copy of newly added relations. */
+ check_pub_table_subscribed(wrconn, sub->publications, copy_data,
+ sub->origin, subrel_local_oids,
+ subrel_count);

"whether we can allow" seems not quite the right wording here anymore,
because now there is no ERROR stopping this - so if there was unwanted
data the COPY will proceed to copy it anyhow...

~~~

8. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed

@@ -1781,6 +1794,121 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
table_close(rel, RowExclusiveLock);
}

+/*
+ * Check and throw a warning if the publisher has subscribed to the same table
+ * from some other publisher. This check is required only if "copy_data = true"
+ * and "origin = none" for CREATE SUBSCRIPTION and
+ * ALTER SUBSCRIPTION ... REFRESH statements to notify user that data having
+ * origin might have been copied.

"throw a warning" -> "log a warning"

"to notify user" -> "to notify the user" ?

~~~

9.

+ if (copydata == false || !origin ||
+ (pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0))
+ return;

SUGGESTION
if (!copydata || !origin ||
(pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0))

~~~

10. (Question)

I can't tell just by reading the code if FOR ALL TABLES case is
handled – e.g. will this recognise the case were the publisher might
have table data from other origins because it has a subscription on
some other node that was publishing "FOR ALL TABLES"?

~~~

11.

+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("publisher has subscribed table \"%s.%s\" from some other publisher",
+ nspname, relname),
+ errdetail("Publisher might have subscribed one or more tables from
some other publisher."),
+ errhint("Verify that these publisher tables do not have data that
has an origin associated before proceeding to avoid inconsistency."));
+ break;

11a.
I'm not sure having that "break" code is correct logic. Won't that
mean the user will only get a warning for the first potential problem
encountered but then other potential problems tables will have no
warnings at all so the user may not be made aware of them? Perhaps you
need to first gather the list of all the suspicious tables before
logging a single warning which shows that list? Or perhaps you need to
log multiple warnings – one warning per suspicious table?

~

11b.
The WARNING message seems a bit inside-out because the errmsg is
giving more details than the errdetail.

SUGGESTIONS (or something similar)
errmsg - "subscription XXX requested origin=NONE but may have copied
data that had a different origin."
errdetail – Publisher YYY has subscribed table \"%s.%s\" from some
other publisher"

~

11c.
The errhint sentence is unusual.

BEFORE
"Verify that these publisher tables do not have data that has an
origin associated before proceeding to avoid inconsistency."

SUGGESTION (or something similar)
Before proceeding, verify that initial data copied from the publisher
tables did not come from other origins.

======

12. src/test/subscription/t/030_origin.pl

+# have remotely originated data from node_A. We throw a warning, in this case,
+# to draw attention to there being possible remote data.

"throw a warning" -> "log a warning" (this occurs 2x)

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2022-08-29 06:30:23 Re: Making Vars outer-join aware
Previous Message Amit Kapila 2022-08-29 06:17:28 Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns