RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Bharath Rupireddy' <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-09-23 04:48:19
Message-ID: TYAPR01MB58666936A0DB0EEDCC929CEEF5FEA@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Bharath,

Again, thank you for reviewing! Here is a new version patch.

> 1.
> +{ oid => '8046', descr => 'for use by pg_upgrade',
> + proname => 'binary_upgrade_validate_wal_records', proisstrict => 'f',
> + provolatile => 'v', proparallel => 'u', prorettype => 'bool',
>
> + if (PG_ARGISNULL(0))
> + elog(ERROR, "null argument to
> binary_upgrade_validate_wal_records is not allowed");
>
> Can proisstrict => 'f' be removed so that there's no need for explicit
> PG_ARGISNULL check? Any specific reason to keep it?

Theoretically it could be, but I was not sure. I think you wanted us to follow
specs of pg_walinspect functions, but it is just a upgrade function. Normally
users cannot call it. Also, as Amit said [1], the caller must consider the
special case. Currently the function returns false at that time, we can change
more appropriate style later.

> And, the before the ISNULL check the arg is read, which isn't good.

Right, fixed.

> 2.
> +Datum
> +binary_upgrade_validate_wal_records(PG_FUNCTION_ARGS)
>
> The function name looks too generic in the sense that it validates WAL
> records for correctness/corruption, but it is not. Can it be something
> like binary_upgrade_{check_for_wal_logical_end,
> check_for_logical_end_of_wal} or such?

Per discussion [2], changed to binary_upgrade_validate_wal_logical_end.

> 3.
> + /* Quick exit if the given lsn is larger than current one */
> + if (start_lsn >= GetFlushRecPtr(NULL))
> + PG_RETURN_BOOL(false);
> +
>
> An LSN that doesn't exists yet is an error IMO, may be an error better here?

We think that the invalid slots should be listed at the end, so basically we do
not want to error out. This would be also changed if there are better opinions.

> 4.
> + * This function is used to verify that there are no WAL records (except some
> + * types) after confirmed_flush_lsn of logical slots, which means all the
> + * changes were replicated to the subscriber. There is a possibility that some
> + * WALs are inserted during upgrade, so such types would be ignored.
> + *
>
> This comment before the function better be at the callsite of the
> function, because as far as this function is concerned, it checks if
> there are any WAL records that are not "certain" types after the given
> LSN, it doesn't know logical slots or confirmed_flush_lsn or such.

Hmm, I think it is better to do the reverse, because otherwise we need to mention
the same explanation at other caller of the function if any. So, I have
adjusted the comments atop and at caller. Thought?

> 8.
> + if (nslots_on_new)
> + pg_fatal("expected 0 logical replication slots but found %d",
> + nslots_on_new);
>
> How about "New cluster database is containing logical replication
> slots", note that the some of the fatal messages are starting with an
> upper-case letter.

I did not use your suggestion, but changed to upper-case.
Actually, the uppper-case rule is broken even in the file. Here I regarded
this sentence as hint message.

> 9.
> + res = executeQueryOrDie(conn, "SHOW wal_level;");
> + res = executeQueryOrDie(conn, "SHOW max_replication_slots;");
>
> Instead of 2 queries to determine required parameters, isn't it better
> with a single query like the following?
>
> select setting from pg_settings where name in ('wal_level',
> 'max_replication_slots') order by name;

Modified, but use ORDER BY ... DESC. This come from a previous comment [3].

>
> 12.
> +extern XLogReaderState *InitXLogReaderState(XLogRecPtr lsn);
> +extern XLogRecord *ReadNextXLogRecord(XLogReaderState *xlogreader);
> +
>
> Why not these functions be defined in xlogreader.h with elog/ereport
> in #ifndef FRONTEND #endif blocks? IMO, xlogreader.h seems right
> location for these functions.

I checked comments atop both files, and xlogreader.h seems better. Fixed.

> 13.
> +LogicalReplicationSlotInfo
>
> Where is this structure defined?

Opps, removed.

[1]: https://www.postgresql.org/message-id/CAA4eK1LxPDeSkTttEAG2MPEWO%3D83vQe_Bja9F4QcCjVn%3DWt9rA%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1L9oJmdxprFR3oob5KLpHUnkJAt5Le4woxO3wHz-SZ%2BTA%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/CAA4eK1LHH_%3DwbxsEn20%3DW%2Bqz1193OqFj-vvJ-u0uHLMmwLHbRw%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v44-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch application/octet-stream 52.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-09-23 04:49:43 RE: [PoC] pg_upgrade: allow to upgrade publisher node
Previous Message Peter Geoghegan 2023-09-23 03:17:16 nbtree's ScalarArrayOp array mark/restore code appears to be buggy