Re: Handle infinite recursion in logical replication setup

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-09-02 08:21:20
Message-ID: CAHut+Psku25+Vjz7HiohWxc2WU07O_ZV4voFG+U7WzwKhUzpGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are my review comments for the latest patch v44-0001.

======

1. doc/src/sgml/logical-replication.sgml

+ <sect1 id="specifying-origins-for-subscription">
+ <title>Specifying origins for subscription</title>

I thought the title is OK, but maybe can be better. OTOH, I am not
sure if my suggestions below are improvements or not. Anyway, even if
the title says the same, the convention is to use uppercase words.

Something like:
"Specifying Origins for Subscriptions"
"Specifying Data Origins for Subscriptions"
"Specifying Data Origins in CREATE SUBSCRIPTION"
"Subscription Data Origins"

~~~

2.

Currently, this new section in this patch is only discussing the
*problems* users might encounter for using copy_data,origin=NONE, but
I think a section with a generic title about "subscription origins"
should be able to stand alone. For example, it should give some brief
mention of what is the meaning/purpose of origin=ANY and origin=NONE.
And it should xref back to CREATE SUBSCRIPTION page.

IMO all this content currently in the patch maybe belongs in a
sub-section of this new section (e.g. call it something like
"Preventing Data Inconsistencies for copy_data,origin=NONE")

~~~

3.

+ To find which tables might potentially include non-local origins (due to
+ other subscriptions created on the publisher) try this SQL query by
+ specifying the publications in IN condition:
+<programlisting>
+SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
+FROM pg_publication P,
+ LATERAL pg_get_publication_tables(P.pubname) GPT
+ LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
+ pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
+ P.pubname IN (...);
+</programlisting>
+ </para>

3a.
I think the "To find..." sentence should be a new paragraph.

~

3b.
Instead of saying "specifying the publications in IN condition" I
think it might be simpler if those instructions are part of the SQL

SUGGESTION
+<programlisting>
+# substitute <pub-names> below with your publication name(s) to be queried
+SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
+FROM pg_publication P,
+ LATERAL pg_get_publication_tables(P.pubname) GPT
+ LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
+ pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
+ P.pubname IN (<pub-names>);
+</programlisting>

~

3c.
</programlisting>
</para>

I recall there was a commit or hackers post sometime earlier this year
that reported it is better for these particular closing tags to be
done together like </programlisting></para> because otherwise there is
some case where the whitespace might not render correctly.
Unfortunately, I can't seem to find the evidence of that post/commit,
but I am sure I saw something about this because I have been using
that practice ever since I saw it.

======

4. doc/src/sgml/ref/alter_subscription.sgml

+ <para>
+ Refer to the <xref
linkend="specifying-origins-for-subscription"/> about how
+ <literal>copy_data = true</literal> can interact with the
+ <literal>origin</literal> parameter.
+ </para>

Previously when this xref pointed to the "Note" section the sentence
looked OK (it said, "Refer to the Note about how...") but now the word
is not "Note" anymore, (e.g. "Refer to the Section 31.3 about how
copy_data = true can interact with the origin parameter. "), so I
think this should be tweaked...

"Refer to the XXX about how..." -> "See XXX for details of how..."

======

5. doc/src/sgml/ref/create_subscription.sgml

Save as comment #4 (2 times)

======

6. src/backend/commands/subscriptioncmds.c

+ if (bsearch(&relid, subrel_local_oids,
+ subrel_count, sizeof(Oid), oid_cmp))
+ isnewtable = false;

SUGGESTION (search PG source - there are examples like this)
newtable = bsearch(&relid, subrel_local_oids, subrel_count,
sizeof(Oid), oid_cmp);

~~~

7.

+ if (list_length(publist))
+ {

I think the proper way to test for non-empty List is

if (publist != NIL) { ... }

or simply

if (publist) { ... }

~~~

8.

+ errmsg("subscription \"%s\" requested origin=NONE but might copy
data that had a different origin",

Do you think this should say "requested copy_data with origin=NONE",
instead of just "requested origin=NONE"?

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-09-02 08:26:08 Re: Handle infinite recursion in logical replication setup
Previous Message Kyotaro Horiguchi 2022-09-02 08:19:42 Re: broken table formatting in psql