Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-08-22 06:01:41
Message-ID: CAHut+PtyA6k13+TCcMzy8W073mWn2MXS7H_bydshO_T+ozh-8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Kuroda-san,

Here are some review comments for patch v23-0002

======
1. GENERAL

Please try to run a spell/grammar check on all the text like commit message
and docs changes before posting (e.g. cut/paste the rendered text into some
tool like MSWord or Grammarly or ChatGPT or whatever tool you like and
cross-check). There are lots of small typos etc but one up-front check
could avoid long cycles of
reviewing/reporting/fixing/re-posting/confirming...

======
Commit message

2.
Note that slot restoration must be done after the final pg_resetwal command
during the upgrade because pg_resetwal will remove WALs that are required by
the slots. Due to ths restriction, the timing of restoring replication
slots is
different from other objects.

~

/ths/this/

======
doc/src/sgml/ref/pgupgrade.sgml

3.
+ <para>
+ Before you start upgrading the publisher cluster, ensure that the
+ subscription is temporarily disabled, by executing
+ <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ...
DISABLE</command></link>.
+ After the upgrade is complete, then re-enable the subscription. Note
that
+ if the new cluser uses different port number from old one,
+ <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ...
CONNECTION</command></link>
+ command must be also executed on subscriber.
+ </para>

3a.
BEFORE
After the upgrade is complete, then re-enable the subscription.

SUGGESTION
Re-enable the subscription after the upgrade.

~

3b.
/cluser/cluster/

~

3c.
Note that
+ if the new cluser uses different port number from old one,
+ <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ...
CONNECTION</command></link>
+ command must be also executed on subscriber.

SUGGESTION
Note that if the new cluster uses a different port number ALTER
SUBSCRIPTION ... CONNECTION command must be also executed on the subscriber.

~~~

4.
+ <listitem>
+ <para>
+ <structfield>confirmed_flush_lsn</structfield> (see <xref
linkend="view-pg-replication-slots"/>)
+ of all slots on old cluster must be same as latest checkpoint
location.
+ </para>
+ </listitem>

4a.
/on old cluster/on the old cluster/

~

4b.
/as latest/as the latest/
~~

5.
+ <listitem>
+ <para>
+ The output plugins referenced by the slots on the old cluster must
be
+ installed on the new PostgreSQL executable directory.
+ </para>
+ </listitem>

/installed on/installed in/ ??

~~

6.
+ <listitem>
+ <para>
+ The new cluster must have
+ <link
linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ configured to value larger than the existing slots on the old
cluster.
+ </para>
+ </listitem>

BEFORE
...to value larger than the existing slots on the old cluster.

SUGGESTION
...to a value greater than or equal to the number of slots present on the
old cluster.

======
src/bin/pg_upgrade/check.c

7. GENERAL - check_for_logical_replication_slots

AFAICT this function is called *only* for the new_cluster, yet there is no
Assert and no checking inside this function to ensure that is the case or
not. It seems strange that the *cluster is passed as an argument but then
the whole function body and messages assume it can only be a new cluster
anyway.

IMO it would be better to rename this function to something like
check_new_cluster_logical_replication_slots() and DO NOT pass any parameter
but just use the global new_cluster within the function body.

~~~

8. check_for_logical_replication_slots

+ /* logical replication slots can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(cluster->major_version) < 1700)
+ return;

Start comment with uppercase for consistency.

~~~

9. check_for_logical_replication_slots

+ res = executeQueryOrDie(conn, "SELECT slot_name "
+ "FROM pg_catalog.pg_replication_slots "
+ "WHERE slot_type = 'logical' AND "
+ "temporary IS FALSE;");
+
+ if (PQntuples(res))
+ pg_fatal("New cluster must not have logical replication slot, but found
\"%s\"",
+ PQgetvalue(res, 0, 0));

/replication slot/replication slots/

~

10. check_for_logical_replication_slots

+ /*
+ * Do additional checks when the logical replication slots have on the old
+ * cluster.
+ */
+ if (nslots)

SUGGESTION
Do additional checks when there are logical replication slots on the old
cluster.

~~~

11.
+ if (nslots > max_replication_slots)
+ pg_fatal("max_replication_slots must be greater than or equal to existing
logical "
+ "replication slots on old cluster.");

11a.
SUGGESTION
max_replication_slots (%d) must be greater than or equal to the number of
logical replication slots (%d) on the old cluster.

~

11b.
I think it would be helpful for the current values to be displayed in the
fatal message so the user will know more about what value to set. Notice
that my above suggestion has some substitution markers.

======
src/bin/pg_upgrade/info.c

12.
+static void
+print_slot_infos(LogicalSlotInfoArr *slot_arr)
+{
+ int slotnum;
+
+ for (slotnum = 0; slotnum < slot_arr->nslots; slotnum++)
+ {
+ LogicalSlotInfo *slot_info = &slot_arr->slots[slotnum];
+ pg_log(PG_VERBOSE, "slotname: \"%s\", plugin: \"%s\", two_phase: %d",
+ slot_info->slotname,
+ slot_info->plugin,
+ slot_info->two_phase);
+ }
+}

Better to have a blank line after the 'slot_info' declaration.

======
.../pg_upgrade/t/003_logical_replication_slots.pl

13.
+# ------------------------------
+# TEST: Confirm pg_upgrade fails when new cluster wal_level is not
'logical'
+
+# Create a slot on old cluster
+$old_publisher->start;
+$old_publisher->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot('test_slot1', 'test_decoding',
false, true);"
+);
+$old_publisher->stop;

13a.
It would be nicer if all the test parts have identical formats. So here it
should also say

# Preparations for the subsequent test:
# 1. Create a slot on the old cluster

~

13b.
Notice the colon (:) at the end of that comment "Preparations for the
subsequent test:". All the other preparation comments in this file should
also have a colon.

~

14.
+# Cause a failure at the start of pg_upgrade because wal_level is replica

SUGGESTION
# pg_upgrade will fail because the new cluster wal_level is 'replica'

~~~

15.
+# 1. Create an unnecessary slot on the old cluster

(but it is not unnecessary -- it is necessary for this test!)

SUGGESTION
+# 1. Create a second slot on the old cluster

~~~

16.
+# Cause a failure at the start of pg_upgrade because the new cluster has
+# insufficient max_replication_slots

SUGGESTION
# pg_upgrade will fail because the new cluster has insufficient
max_replication_slots

~~~

17.
+# Preparations for the subsequent test.
+# 1. Remove an unnecessary slot

SUGGESTION
+# 1. Remove the slot 'test_slot2', leaving only 1 slot remaining on the
old cluster, so the new cluster config max_replication_slots=1 will now be
enough.

~~~

18.
+$new_publisher->start;
+my $result = $new_publisher->safe_psql('postgres',
+ "SELECT slot_name, two_phase FROM pg_replication_slots");
+is($result, qq(test_slot1|t), 'check the slot exists on new cluster');
+$new_publisher->stop;
+
+done_testing();

Maybe should be some added comments like:
# Check that the slot 'test_slot1' has migrated to the new cluster.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-08-22 06:38:09 Re: Support run-time partition pruning for hash join
Previous Message Amit Kapila 2023-08-22 06:01:12 Re: [PoC] pg_upgrade: allow to upgrade publisher node