Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-08-25 08:43:47
Message-ID: CAHut+PtQcou3Bfm9A5SbhFuo2uKK-6u4_j_59so3skAi8Ns03A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are my review comments for patch v25-0002.

In general, I feel where possible the version checking is best done in
the "check.c" file (the filename is a hint). Most of the review
comments below are repeating this point.

======
Commit message.

1.
I felt this should mention the limitation that the slot upgrade
feature is only supported from PG17 slots upwards.

======
doc/src/sgml/ref/pgupgrade.sgml

2.
+ <para>
+ <application>pg_upgrade</application> attempts to migrate logical
+ replication slots. This helps avoid the need for manually defining the
+ same replication slots on the new publisher. Currently,
+ <application>pg_upgrade</application> supports migrate logical replication
+ slots when the old cluster is 17.X and later.
+ </para>

Currently, <application>pg_upgrade</application> supports migrate
logical replication slots when the old cluster is 17.X and later.

SUGGESTION
Migration of logical replication slots is only supported when the old
cluster is version 17.0 or later.

======
src/bin/pg_upgrade/check.c

3. GENERAL

IMO all version checking for this feature should only be done within
this "check.c" file as much as possible.

The detailed reason for this PG17 limitation can be in the file header
comment of "pg_upgrade.c", and then all the version checks can simply
say something like:
"Logical slot migration is only support for slots in PostgreSQL 17.0
and later. See atop file pg_upgrade.c for an explanation of this
limitation "

~~~

4. check_and_dump_old_cluster

+ /* Extract a list of logical replication slots */
+ get_logical_slot_infos();
+

IMO the version checking should only be done in the "checking"
functions, so it should be removed from the within
get_logical_slot_infos() and put here in the caller.

SUGGESTION

/* Logical slots can be migrated since PG17. */
if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
{
/* Extract a list of logical replication slots */
get_logical_slot_infos();
}

~~~

5. check_new_cluster_logical_replication_slots

+check_new_cluster_logical_replication_slots(void)
+{
+ PGresult *res;
+ PGconn *conn;
+ int nslots = count_logical_slots();
+ int max_replication_slots;
+ char *wal_level;
+
+ /* Quick exit if there are no logical slots on the old cluster */
+ if (nslots == 0)
+ return;

IMO the version checking should only be done in the "checking"
functions, so it should be removed from the count_logical_slots() and
then this code should be written more like this:

SUGGESTION (notice the quick return comment change too)

int nslots = 0;

/* Logical slots can be migrated since PG17. */
if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
nslots = count_logical_slots();

/* Quick return if there are no logical slots to be migrated. */
if (nslots == 0)
return;

======
src/bin/pg_upgrade/info.c

6. GENERAL

For the sake of readability it might be better to make the function
names more explicit:

get_logical_slot_infos() -> get_old_cluster_logical_slot_infos()
count_logical_slots() -> count_old_cluster_logical_slots()

~~~

7. get_logical_slot_infos

+/*
+ * get_logical_slot_infos()
+ *
+ * Higher level routine to generate LogicalSlotInfoArr for all databases.
+ *
+ * Note: This function will not do anything if the old cluster is pre-PG 17.
+ * The logical slots are not saved at shutdown, and the confirmed_flush_lsn is
+ * always behind the SHUTDOWN_CHECKPOINT record. Subsequent checks done in
+ * check_for_confirmed_flush_lsn() would raise a FATAL error if such slots are
+ * included.
+ */
+void
+get_logical_slot_infos(void)

Move all this detailed explanation about the limitation to the
file-level comment in "pg_upgrade.c". See also review comment #3.

~~~

8. get_logical_slot_infos

+void
+get_logical_slot_infos(void)
+{
+ int dbnum;
+
+ /* Logical slots can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return;

IMO the version checking is best done in the "checking" functions. See
previous review comments about the caller of this. If you want to put
something here, then just have an Assert:

Assert(GET_MAJOR_VERSION(old_cluster.major_version) >= 1700);

~~~

9. count_logical_slots

+/*
+ * count_logical_slots()
+ *
+ * Sum up and return the number of logical replication slots for all databases.
+ */
+int
+count_logical_slots(void)
+{
+ int dbnum;
+ int slot_count = 0;
+
+ /* Quick exit if the version is prior to PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return 0;
+
+ for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+ slot_count += old_cluster.dbarr.dbs[dbnum].slot_arr.nslots;
+
+ return slot_count;
+}

IMO it is better to remove the version-checking side-effect here. Do
the version checks from the "check" functions where this is called
from. Also removing the check from here gives the ability to output
more useful messages -- e.g. review comment #11

======
src/bin/pg_upgrade/pg_upgrade.c

10. File-level comment

Add a detailed explanation about the limitation in the file-level
comment. See review comment #3 for details.

~~~

11.
+ /*
+ * Create logical replication slots.
+ *
+ * Note: This must be done after doing the pg_resetwal command because
+ * pg_resetwal would remove required WALs.
+ */
+ if (count_logical_slots())
+ {
+ start_postmaster(&new_cluster, true);
+ create_logical_replication_slots();
+ stop_postmaster(false);
+ }
+

IMO it is better to do the explicit version checking here, instead of
relying on a side-effect within the count_logical_slots() function.

SUGGESTION #1

/* Logical replication slot upgrade only supported for old_cluster >= PG17 */
if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
{
if (count_logical_slots())
{
start_postmaster(&new_cluster, true);
create_logical_replication_slots();
stop_postmaster(false);
}
}

AND...

By doing this, you will be able to provide more useful output here like this:

SUGGESTION #2 (my preferred)

if (count_logical_slots())
{
if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
{
pg_log(PG_WARNING,
"\nWARNING: This utility can only upgrade logical
replication slots present in PostgreSQL version %s and later.",
"17.0");
}
else
{
start_postmaster(&new_cluster, true);
create_logical_replication_slots();
stop_postmaster(false);
}
}

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-08-25 08:45:28 Re: Synchronizing slots from primary to standby
Previous Message Jim Jones 2023-08-25 08:27:08 Re: [PATCH] Add XMLText function (SQL/XML X038)