From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> |
Cc: | Greg Nancarrow <gregn4422(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Smith, Peter" <peters(at)fast(dot)au(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Optionally automatically disable logical replication subscriptions on error |
Date: | 2021-11-12 04:08:35 |
Message-ID: | CALDaNm0PdzqVSy0uUZ+UKd9rQeQONUjkALj3Ge+fAmdw0cSLgA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Nov 11, 2021 at 2:50 PM osumi(dot)takamichi(at)fujitsu(dot)com
<osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
>
> On Wednesday, November 10, 2021 1:23 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > On Wed, Nov 10, 2021 at 12:26 PM osumi(dot)takamichi(at)fujitsu(dot)com
> > <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> > >
> > > On Monday, November 8, 2021 10:15 PM vignesh C <vignesh21(at)gmail(dot)com>
> > wrote:
> > > > Thanks for the updated patch. Please create a Commitfest entry for
> > > > this. It will help to have a look at CFBot results for the patch,
> > > > also if required rebase and post a patch on top of Head.
> > > As requested, created a new entry for this - [1]
> > >
> > > FYI: the skip xid patch has been updated to v20 in [2] but the v3 for
> > > disable_on_error is not affected by this update and still applicable
> > > with no regression.
> > >
> > > [1] - https://commitfest.postgresql.org/36/3407/
> > > [2] -
> > >
> > https://www.postgresql.org/message-id/CAD21AoAT42mhcqeB1jPfRL1%2B
> > EUHbZ
> > > k8MMY_fBgsyZvJeKNpG%2Bw%40mail.gmail.com
> >
> > I had a look at this patch and have a couple of initial review comments for some
> > issues I spotted:
> Thank you for checking it.
>
>
> > src/backend/commands/subscriptioncmds.c
> > (1) bad array entry assignment
> > The following code block added by the patch assigns
> > "values[Anum_pg_subscription_subdisableonerr - 1]" twice, resulting in it
> > being always set to true, rather than the specified option value:
> >
> > + if (IsSet(opts.specified_opts, SUBOPT_DISABLE_ON_ERR)) {
> > + values[Anum_pg_subscription_subdisableonerr - 1]
> > + = BoolGetDatum(opts.disableonerr);
> > + values[Anum_pg_subscription_subdisableonerr - 1]
> > + = true;
> > + }
> >
> > The 2nd line is meant to instead be
> > "replaces[Anum_pg_subscription_subdisableonerr - 1] = true".
> > (compare to handling for other similar options)
> Oops, fixed.
>
> > src/backend/replication/logical/worker.c
> > (2) unreachable code?
> > In the patch code there seems to be some instances of unreachable code after
> > re-throwing errors:
> >
> > e.g.
> >
> > + /* If we caught an error above, disable the subscription */ if
> > + (disable_subscription) {
> > + ReThrowError(DisableSubscriptionOnError(cctx));
> > + MemoryContextSwitchTo(ecxt);
> > + }
> >
> > + else
> > + {
> > + PG_RE_THROW();
> > + MemoryContextSwitchTo(ecxt);
> > + }
> >
> >
> > + if (disable_subscription)
> > + {
> > + ReThrowError(DisableSubscriptionOnError(cctx));
> > + MemoryContextSwitchTo(ecxt);
> > + }
> >
> > I'm guessing it was intended to do the "MemoryContextSwitch(ecxt);"
> > before re-throwing (?), but it's not really clear, as in the 1st and 3rd cases, the
> > DisableSubscriptionOnError() calls anyway immediately switch the memory
> > context to cctx.
> You are right I think.
> Fixed based on an idea below.
>
> After an error happens, for some additional work
> (e.g. to report the stats of table sync/apply worker
> by pgstat_report_subworker_error() or
> to update the catalog by DisableSubscriptionOnError())
> restore the memory context that is used before the error (cctx)
> and save the old memory context of error (ecxt). Then,
> do the additional work and switch the memory context to the ecxt
> just before the rethrow. As you described,
> in contrast to PG_RE_THROW, DisableSubscriptionOnError() changes
> the memory context immediatedly at the top of it,
> so for this case, I don't call the MemoryContextSwitchTo().
>
> Another important thing as my modification
> is a case when LogicalRepApplyLoop failed and
> apply_error_callback_arg.command == 0. In the original
> patch of skip xid, it just calls PG_RE_THROW()
> but my previous v3 codes missed this macro in this case.
> Therefore, I've fixed this part as well.
>
> C codes are checked by pgindent.
>
> Note that this depends on the v20 skip xide patch in [1]
>
Thanks for the updated patch, Few comments:
1) tab completion should be added for disable_on_error:
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
"enabled", "slot_name", "streaming",
"synchronous_commit", "two_phase");
2) disable_on_error is supported by alter subscription, the same
should be documented:
@ -871,11 +886,19 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
{
supported_opts = (SUBOPT_SLOT_NAME |
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
-
SUBOPT_STREAMING);
+
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR);
parse_subscription_options(pstate,
stmt->options,
supported_opts, &opts);
+ if (IsSet(opts.specified_opts,
SUBOPT_DISABLE_ON_ERR))
+ {
+
values[Anum_pg_subscription_subdisableonerr - 1]
+ =
BoolGetDatum(opts.disableonerr);
+
replaces[Anum_pg_subscription_subdisableonerr - 1]
+ = true;
+ }
+
3) Describe subscriptions (dRs+) should include displaying of disableonerr:
\dRs+ sub1
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two
phase commit | Synchronous commit | Conninfo
------+---------+---------+-------------+--------+-----------+------------------+--------------------+---------------------------
sub1 | vignesh | t | {pub1} | f | f | d
| off | dbname=postgres port=5432
(1 row)
4) I felt transicent should be transient, might be a typo:
+ Specifies whether the subscription should be automatically disabled
+ if replicating data from the publisher triggers non-transicent errors
+ such as referential integrity or permissions errors. The default is
+ <literal>false</literal>.
5) The commented use PostgresNode and use TestLib can be removed:
+# Test of logical replication subscription self-disabling feature
+use strict;
+use warnings;
+# use PostgresNode;
+# use TestLib;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 10;
Regards,
Vignesh
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2021-11-12 04:11:01 | Re: Data is copied twice when specifying both child and parent table in publication |
Previous Message | Amit Kapila | 2021-11-12 03:43:59 | Re: Parallel vacuum workers prevent the oldest xmin from advancing |