Re: Handle infinite recursion in logical replication setup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-09-02 10:33:44
Message-ID: CALDaNm0yZyntd43mAON3FE9niStW-LUZHE3+dKVRzRroMd6kcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 2 Sept 2022 at 13:51, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are my review comments for the latest patch v44-0001.
>
> ======
>
> 1. doc/src/sgml/logical-replication.sgml
>
> + <sect1 id="specifying-origins-for-subscription">
> + <title>Specifying origins for subscription</title>
>
> I thought the title is OK, but maybe can be better. OTOH, I am not
> sure if my suggestions below are improvements or not. Anyway, even if
> the title says the same, the convention is to use uppercase words.
>
> Something like:
> "Specifying Origins for Subscriptions"
> "Specifying Data Origins for Subscriptions"
> "Specifying Data Origins in CREATE SUBSCRIPTION"
> "Subscription Data Origins"

Modified to "Preventing Data Inconsistencies for copy_data,
origin=NONE" to avoid confusions

> ~~~
>
> 2.
>
> Currently, this new section in this patch is only discussing the
> *problems* users might encounter for using copy_data,origin=NONE, but
> I think a section with a generic title about "subscription origins"
> should be able to stand alone. For example, it should give some brief
> mention of what is the meaning/purpose of origin=ANY and origin=NONE.
> And it should xref back to CREATE SUBSCRIPTION page.
>
> IMO all this content currently in the patch maybe belongs in a
> sub-section of this new section (e.g. call it something like
> "Preventing Data Inconsistencies for copy_data,origin=NONE")

I wanted to just document the inconsistency issue when copy_data and
origin is used. I felt the origin information was already present in
create subscription page. I have not documented the origin again here.
I have renamed the section to "Preventing Data Inconsistencies for
copy_data, origin=NONE" to avoid any confusions.

> ~~~
>
> 3.
>
> + To find which tables might potentially include non-local origins (due to
> + other subscriptions created on the publisher) try this SQL query by
> + specifying the publications in IN condition:
> +<programlisting>
> +SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
> +FROM pg_publication P,
> + LATERAL pg_get_publication_tables(P.pubname) GPT
> + LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
> + pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
> +WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
> + P.pubname IN (...);
> +</programlisting>
> + </para>
>
> 3a.
> I think the "To find..." sentence should be a new paragraph.

modified

> ~
>
> 3b.
> Instead of saying "specifying the publications in IN condition" I
> think it might be simpler if those instructions are part of the SQL
>
> SUGGESTION
> +<programlisting>
> +# substitute <pub-names> below with your publication name(s) to be queried
> +SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename
> +FROM pg_publication P,
> + LATERAL pg_get_publication_tables(P.pubname) GPT
> + LEFT JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),
> + pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
> +WHERE C.oid = GPT.relid AND PS.srrelid IS NOT NULL AND
> + P.pubname IN (<pub-names>);
> +</programlisting>

modified, I could not use <pub-name> since it was considering it as a
tag, instead I have changed it to pub-names

> ~
>
> 3c.
> </programlisting>
> </para>
>
> I recall there was a commit or hackers post sometime earlier this year
> that reported it is better for these particular closing tags to be
> done together like </programlisting></para> because otherwise there is
> some case where the whitespace might not render correctly.
> Unfortunately, I can't seem to find the evidence of that post/commit,
> but I am sure I saw something about this because I have been using
> that practice ever since I saw it.

Modified

> ======
>
> 4. doc/src/sgml/ref/alter_subscription.sgml
>
> + <para>
> + Refer to the <xref
> linkend="specifying-origins-for-subscription"/> about how
> + <literal>copy_data = true</literal> can interact with the
> + <literal>origin</literal> parameter.
> + </para>
>
> Previously when this xref pointed to the "Note" section the sentence
> looked OK (it said, "Refer to the Note about how...") but now the word
> is not "Note" anymore, (e.g. "Refer to the Section 31.3 about how
> copy_data = true can interact with the origin parameter. "), so I
> think this should be tweaked...
>
> "Refer to the XXX about how..." -> "See XXX for details of how..."

Modified

> ======
>
> 5. doc/src/sgml/ref/create_subscription.sgml
>
> Save as comment #4 (2 times)

Modified

> ======
>
> 6. src/backend/commands/subscriptioncmds.c
>
> + if (bsearch(&relid, subrel_local_oids,
> + subrel_count, sizeof(Oid), oid_cmp))
> + isnewtable = false;
>
> SUGGESTION (search PG source - there are examples like this)
> newtable = bsearch(&relid, subrel_local_oids, subrel_count,
> sizeof(Oid), oid_cmp);

This code has been removed as part of another comment, the comment no
more applies.

> ~~~
>
> 7.
>
> + if (list_length(publist))
> + {
>
> I think the proper way to test for non-empty List is
>
> if (publist != NIL) { ... }
>
> or simply
>
> if (publist) { ... }

Modified

> ~~~
>
> 8.
>
> + errmsg("subscription \"%s\" requested origin=NONE but might copy
> data that had a different origin",
>
> Do you think this should say "requested copy_data with origin=NONE",
> instead of just "requested origin=NONE"?

Modified

Thanks for the comments, the v45 patch attached at [1] has the changes
for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm3fwW4mHGfpZfVLaHe_tSavOSPqntD5XPwO%2B_jwScrj_g%40mail.gmail.com

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2022-09-02 10:50:12 Re: [PATCH] Compression dictionaries for JSONB
Previous Message vignesh C 2022-09-02 10:29:01 Re: Handle infinite recursion in logical replication setup