RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Peter Smith' <smithpb2250(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-09-07 12:24:23
Message-ID: TYAPR01MB5866798BC54E743D50E219CCF5EEA@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Peter,

Thank you for reviewing! PSA new version.

> ======
> src/bin/pg_upgrade/check.c
>
> 1. check_new_cluster_logical_replication_slots
>
> + res = executeQueryOrDie(conn, "SHOW max_replication_slots;");
> + max_replication_slots = atoi(PQgetvalue(res, 0, 0));
> +
> + if (PQntuples(res) != 1)
> + pg_fatal("could not determine max_replication_slots");
>
> Shouldn't the PQntuples check be *before* the PQgetvalue and
> assignment to max_replication_slots?

Right, fixed. Also, the checking was added at the first query.

> 2. check_new_cluster_logical_replication_slots
>
> + res = executeQueryOrDie(conn, "SHOW wal_level;");
> + wal_level = PQgetvalue(res, 0, 0);
> +
> + if (PQntuples(res) != 1)
> + pg_fatal("could not determine wal_level");
>
> Shouldn't the PQntuples check be *before* the PQgetvalue and
> assignment to wal_level?

Fixed.

> 3. check_old_cluster_for_valid_slots
>
> I saw that similar code with scripts like this is doing PG_REPORT:
>
> pg_log(PG_REPORT, "fatal");
>
> but that PG_REPORT is missing from this function.

Added.

> src/bin/pg_upgrade/function.c
>
> 4. get_loadable_libraries
>
> @@ -42,11 +43,12 @@ library_name_compare(const void *p1, const void *p2)
> ((const LibraryInfo *) p2)->dbnum;
> }
>
> -
> /*
> * get_loadable_libraries()
>
> ~
>
> Removing that blank line (above this function) should not be included
> in the patch.

Restored the blank.

> 5. get_loadable_libraries
>
> + /*
> + * Allocate a memory for extensions and logical replication output
> + * plugins.
> + */
> + os_info.libraries = pg_malloc_array(LibraryInfo,
> + totaltups + count_old_cluster_logical_slots());
>
> /Allocate a memory/Allocate memory/

Fixed.

> 6. get_loadable_libraries
> + /*
> + * Store the name of output plugins as well. There is a possibility
> + * that duplicated plugins are set, but the consumer function
> + * check_loadable_libraries() will avoid checking the same library, so
> + * we do not have to consider their uniqueness here.
> + */
> + for (slotno = 0; slotno < slot_arr->nslots; slotno++)
>
> /Store the name/Store the names/

Fixed.

> src/bin/pg_upgrade/info.c
>
> 7. get_old_cluster_logical_slot_infos
>
> + i_slotname = PQfnumber(res, "slot_name");
> + i_plugin = PQfnumber(res, "plugin");
> + i_twophase = PQfnumber(res, "two_phase");
> + i_caughtup = PQfnumber(res, "caughtup");
> + i_conflicting = PQfnumber(res, "conflicting");
> +
> + for (slotnum = 0; slotnum < num_slots; slotnum++)
> + {
> + LogicalSlotInfo *curr = &slotinfos[slotnum];
> +
> + curr->slotname = pg_strdup(PQgetvalue(res, slotnum, i_slotname));
> + curr->plugin = pg_strdup(PQgetvalue(res, slotnum, i_plugin));
> + curr->two_phase = (strcmp(PQgetvalue(res, slotnum, i_twophase), "t") == 0);
> + curr->caughtup = (strcmp(PQgetvalue(res, slotnum, i_caughtup), "t") == 0);
> + curr->conflicting = (strcmp(PQgetvalue(res, slotnum, i_conflicting),
> "t") == 0);
> + }
>
> Saying "tup" always looks like it should be something tuple-related.
> IMO it will be better to call all these "caught_up" instead of
> "caughtup":
>
> "caughtup" ==> "caught_up"
> i_caughtup ==> i_caught_up
> curr->caughtup ==> curr->caught_up

Fixed. The alias was also fixed.

> 8. print_slot_infos
>
> +static void
> +print_slot_infos(LogicalSlotInfoArr *slot_arr)
> +{
> + int slotnum;
> +
> + if (slot_arr->nslots > 1)
> + pg_log(PG_VERBOSE, "Logical replication slots within the database:");
> +
> + for (slotnum = 0; slotnum < slot_arr->nslots; slotnum++)
> + {
> + LogicalSlotInfo *slot_info = &slot_arr->slots[slotnum];
> +
> + pg_log(PG_VERBOSE, "slotname: \"%s\", plugin: \"%s\", two_phase: %d",
> + slot_info->slotname,
> + slot_info->plugin,
> + slot_info->two_phase);
> + }
> +}
>
> Although it makes no functional difference, it might be neater if the
> for loop is also within that "if (slot_arr->nslots > 1)" condition.

Hmm, but the point makes more differences between print_rel_infos() and
print_slot_infos(), I thought it should be similar. Instead, I added a quick
return. Thought?

> src/bin/pg_upgrade/pg_upgrade.h
>
> 9.
> +/*
> + * Structure to store logical replication slot information
> + */
> +typedef struct
> +{
> + char *slotname; /* slot name */
> + char *plugin; /* plugin */
> + bool two_phase; /* can the slot decode 2PC? */
> + bool caughtup; /* Is confirmed_flush_lsn the same as latest
> + * checkpoint LSN? */
> + bool conflicting; /* Is the slot usable? */
> +} LogicalSlotInfo;
>
> 9a.
> + bool caughtup; /* Is confirmed_flush_lsn the same as latest
> + * checkpoint LSN? */
>
> caughtup ==> caught_up

Fixed.

> 9b.
> + bool conflicting; /* Is the slot usable? */
>
> The field name has the opposite meaning of the wording of the comment.
> (e.g. it is usable when it is NOT conflicting, right?).
>
> Maybe there needs a better field name, or a better comment, or both.
> AFAICT from other code pg_fatal message 'conflicting' is always
> interpreted as 'lost' so maybe the field should be called that?

Changed to "is_lost", which is easy to understand the meaning.

Also, I fixed following points:

* Added a period to messages in check_new_cluster_logical_replication_slots(),
except the final line. According to other functions like check_new_cluster_is_empty(),
the period is ignored if the escape sequence is at the end.
* Removed the --check test because sometimes it failed on the windows machine.
I reported in another thread [1].
* Set max_slot_wal_keep_size to -1 when old cluster was started. Accordin to the
discussion [2], the setting is sufficient to supress the WAL removal.

[1]: https://www.postgresql.org/message-id/flat/TYAPR01MB586654E2D74B838021BE77CAF5EEA%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[2]: https://www.postgresql.org/message-id/ZPl659a5hPDHPq9w%40paquier.xyz

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v33-0001-Persist-logical-slots-to-disk-during-a-shutdown-.patch application/octet-stream 9.8 KB
v33-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch application/octet-stream 39.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo NAGATA 2023-09-07 12:56:51 Re: psql help message contains excessive indentations
Previous Message Daniel Gustafsson 2023-09-07 12:14:14 Re: add (void) cast inside advance_aggregates for function ExecEvalExprSwitchContext