Re: Handle infinite recursion in logical replication setup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-09-01 18:53:16
Message-ID: CALDaNm1T5+AVaRYstmJV9=EHAP3qfgCz9ZnQzfh-qk3q975Bxg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 1 Sept 2022 at 09:19, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Aug 31, 2022 at 11:35 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Here are my review comments for v43-0001.
> >
> > ======
> >
> > 1. Commit message
> >
> > The initial copy phase has no way to know the origin of the row data,
> > so if 'copy_data = true' in the step 4 below, log a warning to notify user
> > that potentially non-local data might have been copied.
> >
> > 1a.
> > "in the step 4" -> "in step 4"
> >
> > ~
> >
> > 1b.
> > "notify user" -> "notify the user"
> >
> > ======
> >
> > 2. doc/src/sgml/ref/create_subscription.sgml
> >
> > + <para>
> > + If the subscription is created with <literal>origin = none</literal> and
> > + <literal>copy_data = true</literal>, it will check if the publisher has
> > + subscribed to the same table from other publishers and, if so, log a
> > + warning to notify the user to check the publisher tables. Before continuing
> > + with other operations the user should check that publisher tables did not
> > + have data with different origins to prevent data inconsistency issues on the
> > + subscriber.
> > + </para>
> >
> > I am not sure about that wording saying "to prevent data inconsistency
> > issues" because I thought maybe is already too late because any
> > unwanted origin data is already copied during the initial sync. I
> > think the user can do it try to clean up any problems caused before
> > things become even worse (??)
> >
> > ~~~
> >
> > You asked for my thoughts (see [1] 6b):
> >
> > > There is nothing much a user can do in this case. Only option would be
> > > to take a backup before the operation and restore it and then recreate
> > > the replication setup. I was not sure if we should document these
> > > steps as similar inconsistent data could get created without the
> > > origin option . Thoughts?
> >
> > I think those cases are a bit different:
> >
> > - For a normal scenario (i.e. origin=ANY) then inconsistent data just
> > refers to problems like maybe PK violations so I think you just get
> > what you get and have to deal with the errors...
> >
> > - But when the user said origin=NONE they are specifically saying they
> > do NOT want any non-local data, yet they still might end up getting
> > data contrary to their wishes. So I think maybe there ought to be
> > something documented to warn about this potentially happening. My
> > first impression is that all this seems like a kind of terrible
> > problem because if it happens then cleanup could be difficult - if
> > that subscriber in turn was also a publisher then this bad data might
> > have propagated all over the place already! Anyway, I guess all this
> > could be mentioned in some (new ?) section of the
> > logical-replication.sgml file (i.e. patch 0002).
> >
> > IDEA: I was wondering if the documentation should recommend (or maybe
> > we can even enforce this) that when using origin=NONE the user should
> > always create the subscription with enabled=FALSE. That way if there
> > are some warnings about potential bad origin data then there it might
> > be possible to take some action or change their mind *before* any
> > unwanted data pollutes everything.
> >
> > ======
> >
> > 3. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed
> >
> > + if (count == 0)
> > + tablenames = makeStringInfo();
> > + else
> > + appendStringInfoString(tablenames, ", ");
> > +
> > + appendStringInfo(tablenames, "%s.%s", nspname, relname);
> > + count++;
> >
> > I think the quotes are not correct - each separate table name should be quoted.
> >
> > ~~~
> >
> > 4.
> >
> > + /*
> > + * There might be many tables present in this case, we will display a
> > + * maximum of 100 tables followed by "..." to indicate there might be
> > + * more tables.
> > + */
> > + if (count == 100)
> > + {
> > + appendStringInfoString(tablenames, ", ...");
> > + break;
> > + }
> >
> > 4a.
> > IMO this code should be part of the previous if/else
> >
> > e.g.
> > if (count == 0) ... makeStringInfo()
> > else if (count > 100) ... append ellipsis "..." and break out of the loop
> > else ... append table name to the message
> >
> > Doing it in this way means "..." definitely means there are more than
> > 100 bad tables.
> >
> > ~
> >
> > 4b.
> > Changing like above (#4a) simplifies the comment because it removes
> > doubts like "might be…" since you already know you found the 101st
> > table.
> >
> > SUGGESTION (comment)
> > The message displays a maximum of 100 tables having potentially
> > non-local data. Any more than that will be indicated by "...".
> >
> > ~
> >
>
> Are we using this style of an error message anywhere else in the code?
> If not, then I think we should avoid it here as well as it appears a
> bit adhoc to me in the sense that why restrict at 100 tables. Can we
> instead think of displaying the publications list in the message? So
> errdetail would be something like: Subscribed publications \"%s\" has
> been subscribed from some other publisher. Then we can probably give a
> SQL query for a user to find tables that may data from multiple
> origins especially if that is not complicated. Also, then we can
> change the function name to check_publications_origin().

Modified as suggested.

> Few other minor comments on v43-0001*
> 1.
> + ereport(WARNING,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("subscription %s requested origin=NONE but might copy data
> that had a different origin.",
> + subname),
>
> In error message, we don't use full stop at the end.

Modified

> 2.
> + ereport(WARNING,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("subscription %s requested origin=NONE but might copy data
> that had a different origin.",
> + subname),
> + errdetail("Subscribed table(s) \"%s\" has been subscribed from some
> other publisher.",
> + tablenames->data),
>
> It is better to use errdetail_plural here as there could be one or
> more objects in this message?

Modified the new errdetail message to errdetail_plural.

Thanks for the comments, the v44 version attached at [1] has the
changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm0NRJ1O1cYcZD%3Df7NgynozFprb7zpJSayFN5rcaS44G6Q%40mail.gmail.com

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-09-01 19:07:09 Re: add test: pg_rowlocks extension
Previous Message Nathan Bossart 2022-09-01 18:51:53 make additional use of optimized linear search routines