RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Peter Smith' <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-10-18 09:25:38
Message-ID: TYCPR01MB5870EBEBC89F5224F6B3788CF5D5A@TYCPR01MB5870.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Peter,

Thank you for reviewing! PSA new version.
Note that 0001 and 0002 are combined into one patch.

> Here are some review comments for v51-0001
>
> ======
> src/bin/pg_upgrade/check.c
>
> 0.
> +check_old_cluster_for_valid_slots(bool live_check)
> +{
> + char output_path[MAXPGPATH];
> + FILE *script = NULL;
> +
> + prep_status("Checking for valid logical replication slots");
> +
> + snprintf(output_path, sizeof(output_path), "%s/%s",
> + log_opts.basedir,
> + "invalid_logical_relication_slots.txt");
>
> 0a
> typo /invalid_logical_relication_slots/invalid_logical_replication_slots/

Fixed.

> 0b.
> Since the non-upgradable slots are not strictly "invalid", is this an
> appropriate filename for the bad ones?
>
> But I don't have very good alternatives. Maybe:
> - non_upgradable_logical_replication_slots.txt
> - problem_logical_replication_slots.txt

Per discussion [1], I kept current style.

> src/bin/pg_upgrade/t/003_upgrade_logical_replication_slots.pl
>
> 1.
> +# ------------------------------
> +# TEST: Confirm pg_upgrade fails when wrong GUC is set on new cluster
> +#
> +# There are two requirements for GUCs - wal_level and max_replication_slots,
> +# but only max_replication_slots will be tested here. This is because to
> +# reduce the execution time of the test.
>
>
> SUGGESTION
> # TEST: Confirm pg_upgrade fails when the new cluster has wrong GUC values.
> #
> # Two GUCs are required - 'wal_level' and 'max_replication_slots' - but to
> # reduce the test execution time, only 'max_replication_slots' is tested here.

First part was fixed. Second part was removed per [1].

> 2.
> +# Preparations for the subsequent test:
> +# 1. Create two slots on the old cluster
> +$old_publisher->start;
> +$old_publisher->safe_psql('postgres',
> + "SELECT pg_create_logical_replication_slot('test_slot1',
> 'test_decoding', false, true);"
> +);
> +$old_publisher->safe_psql('postgres',
> + "SELECT pg_create_logical_replication_slot('test_slot2',
> 'test_decoding', false, true);"
> +);
>
>
> Can't you combine those SQL in the same $old_publisher->safe_psql.

Combined.

> 3.
> +# Clean up
> +rmtree($new_publisher->data_dir . "/pg_upgrade_output.d");
> +# Set max_replication_slots to the same value as the number of slots. Both of
> +# slots will be used for subsequent tests.
> +$new_publisher->append_conf('postgresql.conf', "max_replication_slots = 1");
>
> The code doesn't seem to match the comment - is this correct? The
> old_publisher created 2 slots, so why are you setting new_publisher
> "max_replication_slots = 1" again?

Fixed to "max_replication_slots = 2" Note that previous test worked well because
GUC checking on new cluster is done after checking the status of slots.

> 4.
> +# Preparations for the subsequent test:
> +# 1. Generate extra WAL records. Because these WAL records do not get
> consumed
> +# it will cause the upcoming pg_upgrade test to fail.
> +$old_publisher->start;
> +$old_publisher->safe_psql('postgres',
> + "CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;");
> +
> +# 2. Advance the slot test_slot2 up to the current WAL location
> +$old_publisher->safe_psql('postgres',
> + "SELECT pg_replication_slot_advance('test_slot2', NULL);");
> +
> +# 3. Emit a non-transactional message. test_slot2 detects the message so that
> +# this slot will be also reported by upcoming pg_upgrade.
> +$old_publisher->safe_psql('postgres',
> + "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');"
> +);
>
>
> I felt this test would be clearer if you emphasised the state of the
> test_slot1 also. e.g.
>
> 4a.
> BEFORE
> +# 1. Generate extra WAL records. Because these WAL records do not get
> consumed
> +# it will cause the upcoming pg_upgrade test to fail.
>
> SUGGESTION
> # 1. Generate extra WAL records. At this point neither test_slot1 nor test_slot2
> # has consumed them.

Fixed.

> 4b.
> BEFORE
> +# 2. Advance the slot test_slot2 up to the current WAL location
>
> SUGGESTION
> # 2. Advance the slot test_slot2 up to the current WAL location, but test_slot2
> # still has unconsumed WAL records.

IIUC, test_slot2 is caught up by pg_replication_slot_advance('test_slot2'). I think
"but test_slot1 still has unconsumed WAL records." is appropriate. Fixed.

> 5.
> +# pg_upgrade will fail because the slot still has unconsumed WAL records
> +command_checks_all(
>
> /because the slot still has/because there are slots still having/

Fixed.

> 6.
> + [qr//],
> + 'run of pg_upgrade of old cluster with slot having unconsumed WAL records'
> +);
>
> /slot/slots/

Fixed.

> 7.
> +# And check the content. Both of slots must be reported that they have
> +# unconsumed WALs after confirmed_flush_lsn.
>
> SUGGESTION
> # Check the file content. Both slots should be reporting that they have
> # unconsumed WAL records.

Fixed.

>
> 8.
> +# Preparations for the subsequent test:
> +# 1. Setup logical replication
> +my $old_connstr = $old_publisher->connstr . ' dbname=postgres';
> +
> +$old_publisher->start;
> +
> +$old_publisher->safe_psql('postgres',
> + "SELECT * FROM pg_drop_replication_slot('test_slot1');");
> +$old_publisher->safe_psql('postgres',
> + "SELECT * FROM pg_drop_replication_slot('test_slot2');");
> +
> +$old_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION regress_pub FOR ALL TABLES;");
>
>
> 8a.
> /Setup logical replication/Setup logical replication (first, cleanup
> slots from the previous tests)/

Fixed.

> 8b.
> Can't you combine all those SQL in the same $old_publisher->safe_psql.

Combined.

> 9.
> +
> +# Actual run, successful upgrade is expected
> +command_ok(
> + [
> + 'pg_upgrade', '--no-sync',
> + '-d', $old_publisher->data_dir,
> + '-D', $new_publisher->data_dir,
> + '-b', $bindir,
> + '-B', $bindir,
> + '-s', $new_publisher->host,
> + '-p', $old_publisher->port,
> + '-P', $new_publisher->port,
> + $mode,
> + ],
> + 'run of pg_upgrade of old cluster');
>
> Now that the "Dry run" part is removed, it seems unnecessary to say
> "Actual run" for this part.
>
>
> SUGGESTION
> # pg_upgrade should be successful.

Fixed.

[1]: https://www.postgresql.org/message-id/CAA4eK1%2BAHSWPs2_jn%3DftJKRqz-NXU6o%3DrPQ3f%3DH-gcPsgpPFrw%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v52-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch application/octet-stream 48.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-10-18 09:27:09 RE: [PoC] pg_upgrade: allow to upgrade publisher node
Previous Message Jelte Fennema 2023-10-18 09:07:00 Re: run pgindent on a regular basis / scripted manner