Re: pg_upgrade and logical replication

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: pg_upgrade and logical replication
Date: 2023-04-13 10:04:19
Message-ID: 20230413100419.sygjck3vd4xtn52f@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Apr 13, 2023 at 12:42:05PM +1000, Peter Smith wrote:
> Here are some review comments for patch v4-0001 (not the test code)

Thanks!

>
> (There are some overlaps here with what Kuroda-san already posted
> yesterday because we were looking at the same patch code. Also, a few
> of my comments might become moot points if refactoring will be done
> according to Kuroda-san's "general" questions).

Ok, for the record, the parts I don't reply to are things I fully agree with
and already changed locally.

> ======
> Commit message
>
> 1.
> To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
> additional commands to be able to restore the content of pg_subscription_rel,
> and addition LSN parameter in the subscription creation to restore the
> underlying replication origin remote LSN. The LSN parameter is only accepted
> in CREATE SUBSCRIPTION in binary upgrade mode.
>
> ~
>
> SUGGESTION
> To fix this problem, this patch teaches pg_dump in binary upgrade mode
> to emit additional ALTER SUBSCRIPTION commands to facilitate restoring
> the content of pg_subscription_rel, and provides an additional LSN
> parameter for CREATE SUBSCRIPTION to restore the underlying
> replication origin remote LSN. The new ALTER SUBSCRIPTION syntax and
> new LSN parameter are not exposed to the user -- they are only
> accepted in binary upgrade mode.

Thanks, I eventually adapted a bit more the suggested wording:

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional ALTER SUBSCRIPTION subcommands that will restore the content of
pg_subscription_rel, and also provides an additional LSN parameter for CREATE
SUBSCRIPTION to restore the underlying replication origin remote LSN. The new
ALTER SUBSCRIPTION subcommand and the new LSN parameter are not exposed to
users and only accepted in binary upgrade mode.

The new ALTER SUBSCRIPTION subcommand has the following syntax:

> 2b.
> The link renders strangely. It just says:
>
> See the subscription part in the [section called "Notes"] for more information.
>
> Maybe the link part can be rewritten so that it renders more nicely,
> and also makes mention of pg_dump.

Yes I saw that. I didn't try to look at it yet but that's indeed what I wanted
to do eventually.

> ======
> src/backend/commands/subscriptioncmds.c
>
> 3.
> +#define SUBOPT_RELID 0x00008000
> +#define SUBOPT_STATE 0x00010000
>
> Maybe 'SUBOPT_RELSTATE' is a better name for this per-relation state option?

I looked at it but part of the existing code is already using state as a
variable name, to be consistent with pg_subscription_rel.srsubstate. I think
it's better to use the same pattern in this patch.

> 6.
> + if (strlen(state_str) != 1)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("invalid relation state used")));
>
> IIUC this is syntax not supposed to be reachable by user input. Maybe
> there is some merit in making the errors similar looking to the normal
> options, but OTOH it could also be misleading.

It doesn't cost much and may be helpful for debugging so I will use error
messages similar to the error facing ones.

> This might as well just be: Assert(strlen(state_str) == 1 &&
> *state_str == SUBREL_STATE_READY);
> or even simply: Assert(IsBinaryUpgrade);

As I mentioned in a previous email, it's still unclear to me whether the
restriction on the srsubstate will be weakened or not, so I prefer to keep such
part of the code generic and have the restriction centralized in the pg_upgrade
check.

I added some Assert(IsBinaryUpgrade) in those code path as it may not be
evident in this place that it's a requirement.

> 7. CreateSubscription
>
> + if(IsBinaryUpgrade)
> + supported_opts |= SUBOPT_LSN;
> parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
> 7b.
> I wonder if this was deserving of a comment something like "The LSN
> option is for internal use only"...

I was thinking that being valid only for IsBinaryUpgrade would be enough?

> 8. CreateSubscription
>
> + originid = replorigin_create(originname);
> +
> + if (IsBinaryUpgrade && IsSet(opts.lsn, SUBOPT_LSN))
> + replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
> + false /* backward */ ,
> + false /* WAL log */ );
>
> I think the 'IsBinaryUpgrade' check is redundant here because
> SUBOPT_LSN is not possible to be set unless that is true anyhow.

It's indeed redundant for now, but it's also used as a safeguard if some code
is changed. Maybe just having an assert(IsBinaryUpgrade) would be better
though.

While looking at it I noticed that this code was never reached, as I should
have checked IsSet(opts.specified_opts, ...). I fixed that and added a TAP
test to make sure that the restored remote_lsn is the same as on the old
subscription node.

> 9. AlterSubscription
>
> + AddSubscriptionRelState(subid, opts.relid, opts.state,
> + opts.lsn);
>
> This line wrapping of AddSubscriptionRelState seems unnecessary.

Without it the line reaches 81 characters :(

> ======
> src/bin/pg_dump/pg_backup.h
>
> 10.
> +
> + bool preserve_subscriptions;
> } DumpOptions;
>
>
> Maybe name this field "preserve_subscription_state" for consistency
> with the option name.

That's what I thought when I first wrote that code but I quickly had to use a
shorter name to avoid bloating the line length everywhere.

> ======
> src/bin/pg_dump/pg_dump.c
>
> 11. dumpSubscription
>
> if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
> + {
> + for (i = 0; i < subinfo->nrels; i++)
> + {
> + appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
> + "(relid = %u, state = '%c'",
> + qsubname,
> + subinfo->subrels[i].srrelid,
> + subinfo->subrels[i].srsubstate);
> +
> + if (subinfo->subrels[i].srsublsn[0] != '\0')
> + appendPQExpBuffer(query, ", LSN = '%s'",
> + subinfo->subrels[i].srsublsn);
> +
> + appendPQExpBufferStr(query, ");");
> + }
> +
>
> Maybe I misunderstood something -- Shouldn't this new ALTER
> SUBSCRIPTION TABLE cmd only be happening when the option
> dopt->preserve_subscriptions is true?

It indirectly is, as in that case subinfo->nrels is guaranteed to be 0. I just
tried to keep the code simpler and avoid too many nested conditions.

> 12b.
> Should include the indent file typdefs.list in the patch, and add this
> new typedef to it.

FTR I checked and there wasn't too many noise when running pgindent on the
touched files, so I already locally added the new typedef and ran pgindent.

> 14.
> + /* No subscription before pg10. */
> + if (GET_MAJOR_VERSION(cluster->major_version < 1000))
> + return;
>
> 14a.
> The existing checking code seems slightly different to this because
> the other check_XXX calls are guarded by the GET_MAJOR_VERSION before
> being called.

No opinion on that, so I moved all the checks on the caller side.

> 14b.
> Furthermore, I was confused about the combination when the < PG10 and
> user_opts.preserve_subscriptions is true. Since this is just a return
> (not an error) won't the subsequent pg_dump still attempt to use that
> option (--preserve-subscriptions) even though we already know it
> cannot work?

Will it error out though? I haven't tried but I think it will just silently do
nothing, which maybe isn't ideal, but may be somewhat expected if you try to
preserve something that doesn't exist.

> Would it be better to give an ERROR saying -preserve-subscriptions is
> incompatible with the old PG version?

I'm not opposed to adding some error, but I don't really know where it would
really be suitable. Maybe in the same code path explicitly error out if the
preserve subscription option is used with a pg10- source server?

> 15b.
> I guess it would be more useful if the message can include the names
> of the failing subscription and/or the relation that was in the wrong
> state. Maybe that means moving all this checking logic into the
> pg_dump code?

I think it's better to have the checks only once, so in pg_upgrade, but I'm not
strongly opposed to duplicate those tests if there's any complaint. In the
meantime I rephrased the warning to give the name of the problematic
subscription (but not the list of relation, as it's more likely to be a long
list and it's easy to check manually afterwards and/or wait for all sync to
finish).

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2023-04-13 10:35:26 Re: [PATCH] Use role name "system_user" instead of "user" for unsafe_tests
Previous Message Alvaro Herrera 2023-04-13 09:58:51 Re: Clean up hba.c of code freeing regexps