Re: pg_upgrade and logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: pg_upgrade and logical replication
Date: 2023-11-10 13:56:18
Message-ID: CALDaNm0mGz6_69BiJTmEqC8Q0U0x2nMZOs3w9btKOHZZpfC2ow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 9 Nov 2023 at 12:23, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Thu, Nov 09, 2023 at 01:14:05PM +1100, Peter Smith wrote:
> > Looks like v12 accidentally forgot to update this to the modified
> > function name 'binary_upgrade_add_sub_rel_state'
>
> This v12 is overall cleaner than its predecessors. Nice to see.
>
> +my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
> +is($result, qq(1), "check initial t1 table data on publisher");
> +$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
> +is($result, qq(1), "check initial t1 table data on publisher");
> +$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
> +is($result, qq(1), "check initial t1 table data on the old subscriber");
> +$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
>
> I'd argue that t1 and t2 should have less generic names. t1 is used
> to check that the upgrade process works, while t2 is added to the
> publication after upgrading the subscriber. Say something like
> tab_upgraded or tab_not_upgraded?

Modified

> +my $synced_query =
> + "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
>
> Perhaps it would be safer to use a query that checks the number of
> relations in 'r' state? This query would return true if
> pg_subscription_rel has no tuples.

Modified

> +# Table will be in 'd' (data is being copied) state as table sync will fail
> +# because of primary key constraint error.
> +my $started_query =
> + "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
>
> Relying on a pkey error to enforce an incorrect state is a good trick.
> Nice.

That was better way to get data sync state without manually changing
the pg_subscription_rel catalog

> +command_fails(
> + [
> + 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> + '-D', $new_sub->data_dir, '-b', $bindir,
> + '-B', $bindir, '-s', $new_sub->host,
> + '-p', $old_sub->port, '-P', $new_sub->port,
> + $mode, '--check',
> + ],
> + 'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state');
> +rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
>
> Okay by me to not stop the cluster for the --check to shave a few
> cycles. It's a bit sad that we don't cross-check the contents of
> subscription_state.txt before removing pg_upgrade_output.d. Finding
> the file is easy even if the subdir where it is included is not a
> constant name. Then it is possible to apply a regexp with the
> contents consumed by a slurp_file().

Modified

> +my $remote_lsn = $old_sub->safe_psql('postgres',
> + "SELECT remote_lsn FROM pg_replication_origin_status");
> Perhaps you've not noticed, but this would be 0/0 most of the time.
> However the intention is to check after a valid LSN to make sure that
> the origin is set, no?

I have added few more inserts to make remote_lsn not be 0/0

> I am wondering whether this should use a bit more data than just one
> tuple, say at least two transaction, one of them with a multi-value
> INSERT?

Added one more multi-insert

> +# ------------------------------------------------------
> +# Check that pg_upgrade is successful when all tables are in ready state.
> +# ------------------------------------------------------
> This comment is a bit inconsistent with the state that are accepted,
> but why not, at least that's predictible.

The key test validation is mentioned in this style of comment

> + * relation to pg_subscripion_rel table. This will be used only in
>
> Typo: s/pg_subscripion_rel/pg_subscription_rel/.

Modified

> This needs some word-smithing to explain the reasons why a state is
> not needed:
>
> + /*
> + * The subscription relation should be in either i (initialize),
> + * r (ready) or s (synchronized) state as either the replication slot
> + * is not created or the replication slot is already dropped and the
> + * required WAL files will be present in the publisher. The other
> + * states are not ok as the worker has dependency on the replication
> + * slot/origin in these case:
>
> A slot not created yet refers to the 'i' state, while 'r' and 's'
> refer to a slot created previously but already dropped, right?
> Shouldn't this comment tell that rather than mixing the assumptions?

Modified

> + * a) SUBREL_STATE_DATASYNC: In this case, the table sync worker will
> + * try to drop the replication slot but as the replication slots will
> + * be created with old subscription id in the publisher and the
> + * upgraded subscriber will not be able to clean the slots in this
> + * case.
>
> Proposal: A relation upgraded while in this state would retain a
> replication slot, which could not be dropped by the sync worker
> spawned after the upgrade because the subscription ID tracked by the
> publisher does not match anymore.

Modified

> Note: actually, this would be OK if we are able to keep the OIDs of
> the subscribers consistent across upgrades? I'm OK to not do nothing
> about that in this patch, to keep it simpler. Just asking in passing.

I will analyze more on this and post the analysis in the subsequent mail.

> + * b) SUBREL_STATE_FINISHEDCOPY: In this case, the tablesync worker will
> + * expect the origin to be already existing as the origin is created
> + * with an old subscription id, tablesync worker will not be able to
> + * find the origin in this case.
>
> Proposal: A tablesync worker spawned to work on a relation upgraded
> while in this state would expect an origin ID with the OID of the
> subscription used before the upgrade, causing it to fail.

Modified

> + "A list of problem subscriptions is in the file:\n"
>
> Sounds a bit strange, perhaps use an extra "the", as of "the problem
> subscriptions"?

Modified

> Could it be worth mentioning in the docs that one could also DISABLE
> the subscriptions before running the upgrade?

I felt since the changes that we are planning to make won't start the
apply workers during upgrade, there will be no impact even if the
subscriptions are enabled. I felt no need to mention it unless we are
planning to allow starting of apply workers during upgrade.

> + The replication origin entry corresponding to each of the subscriptions
> + should exist in the old cluster. This can be found by checking
> + <link linkend="catalog-pg-subscription">pg_subscription</link> and
> + <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
> + system tables.
>
> Hmm. No need to mention pg_replication_origin_status?

When we create origin, the origin status would be created implicitly,
I felt we need not check on replication origin status and also need
not mention it here.

> If I may ask, how did you check that the given relation states were
> OK or not OK? Did you hardcode some wait points in tablesync.c up to
> where a state is updated in pg_subscription_rel, then shutdown the
> cluster before the upgrade to maintain the catalog in this state?
> Finally, after the upgrade, you've cross-checked the dependencies on
> the slots and origins to see that the spawned sync workers turned
> crazy because of the inconsistencies. Right?

I did testing in the same lines that you mentioned. Apart from that I
also reviewed the design where it was using the old subscription id
like in case of table sync workers, the tables sync worker will use
replication using old subscription id. replication slot and
replication origin. I also checked the impact of remote_lsn's.
Few example: IN SUBREL_STATE_DATASYNC state we will try to drop the
replication slot once worker is started but since the slot will be
created with an old subscription, we will not be able to drop the
replication slot and create a leak. Similarly the problem exists with
SUBREL_STATE_FINISHEDCOPY where we will not be able to drop the origin
created with an old sub id.

Thanks for the comments, the attached v13 version patch has the
changes for the same.

Regards,
Vignesh

Attachment Content-Type Size
v13-0001-Preserve-the-full-subscription-s-state-during-pg.patch text/x-patch 40.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2023-11-10 14:01:38 Re: pg_upgrade and logical replication
Previous Message Laurenz Albe 2023-11-10 12:43:51 Re: Bug: RLS policy FOR SELECT is used to check new rows