RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Peter Smith' <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-10-19 10:42:50
Message-ID: TYCPR01MB587007EA2F9AB92F0E1F5957F5D4A@TYCPR01MB5870.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Peter,

Thanks for reviewing! PSA new version.

> ======
> src/bin/pg_upgrade/t/003_upgrade_logical_replication_slots.pl
>
> 1.
> + # 2. max_replication_slots is set to smaller than the number of slots (2)
> + # present on the old cluster
>
> SUGGESTION
> 2. Set 'max_replication_slots' to be less than the number of slots (2)
> present on the old cluster.

Fixed.

> 2.
> + # Set max_replication_slots to the same value as the number of slots. Both
> + # of slots will be used for subsequent tests.
>
> SUGGESTION
> Set 'max_replication_slots' to match the number of slots (2) present
> on the old cluster.
> Both slots will be used for subsequent tests.

Fixed.

>
> 3.
> + # 3. Emit a non-transactional message. test_slot2 detects the message so
> + # that this slot will be also reported by upcoming pg_upgrade.
> + $old_publisher->safe_psql('postgres',
> + "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');"
> + );
>
> SUGGESTION
> 3. Emit a non-transactional message. This will cause test_slot2 to
> detect the unconsumed WAL record.

Fixed.

>
> 4.
> + # Preparations for the subsequent test:
> + # 1. Generate extra WAL records. At this point neither test_slot1 nor
> + # test_slot2 has consumed them.
> + $old_publisher->start;
> + $old_publisher->safe_psql('postgres',
> + "CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;");
> +
> + # 2. Advance the slot test_slot2 up to the current WAL location, but
> + # test_slot1 still has unconsumed WAL records.
> + $old_publisher->safe_psql('postgres',
> + "SELECT pg_replication_slot_advance('test_slot2', NULL);");
> +
> + # 3. Emit a non-transactional message. test_slot2 detects the message so
> + # that this slot will be also reported by upcoming pg_upgrade.
> + $old_publisher->safe_psql('postgres',
> + "SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');"
> + );
> +
> + $old_publisher->stop;
>
> All of the above are sequentially executed on the
> old_publisher->safe_psql, so consider if it is worth combining them
> all in a single call (keeping the comments 1,2,3 separate still)
>
> For example,
>
> $old_publisher->start;
> $old_publisher->safe_psql('postgres', qq[
> CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;
> SELECT pg_replication_slot_advance('test_slot2', NULL);
> SELECT count(*) FROM pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');
> ]);
> $old_publisher->stop;

Fixed.

>
> 5.
> + # Clean up
> + $subscriber->stop();
> + $new_publisher->stop();
>
> Should this also drop the 'test_slot1' and 'test_slot2'?

'test_slot1' and 'test_slot2' have already been removed while preparing in
"Successful upgrade" case. Also, I don't think objects have to be removed at the
end. It is tested by other parts, and it may make the test more difficult to
debug, if there are some failures.

> 6.
> +# Verify that logical replication slots cannot be migrated. This function
> +# will be executed when the old cluster is PG16 and prior.
> +sub test_upgrade_from_pre_PG17
> +{
> + my ($old_publisher, $new_publisher, $mode) = @_;
> +
> + my $oldbindir = $old_publisher->config_data('--bindir');
> + my $newbindir = $new_publisher->config_data('--bindir');
>
> SUGGESTION (let's not mention lots of different numbers; just refer to 17)
> This function will be executed when the old cluster version is prior to PG17.

Fixed.

> 7.
> + # Actual run, successful upgrade is expected
> + command_ok(
> + [
> + 'pg_upgrade', '--no-sync',
> + '-d', $old_publisher->data_dir,
> + '-D', $new_publisher->data_dir,
> + '-b', $oldbindir,
> + '-B', $newbindir,
> + '-s', $new_publisher->host,
> + '-p', $old_publisher->port,
> + '-P', $new_publisher->port,
> + $mode,
> + ],
> + 'run of pg_upgrade of old cluster');
> +
> + ok( !-d $new_publisher->data_dir . "/pg_upgrade_output.d",
> + "pg_upgrade_output.d/ removed after pg_upgrade success");
>
> 7a.
> The comment is wrong?
>
> SUGGESTION
> # pg_upgrade should NOT be successful

No, pg_uprade will success but no logical replication slots are migrated.
Comments docs were added.

> 7b.
> There is a blank line here before the ok() function, but in the other
> tests, there was none. Better to be consistent.

Removed.

> 8.
> + # Clean up
> + $new_publisher->stop();
>
> Should this also drop the 'test_slot'?

I don't think so. Please see above.

>
> 9.
> +# The TAP Cluster.pm assigns default 'max_wal_senders' and 'max_connections'
> to
> +# the same value (10) but PG12 and prior considered max_walsenders as a
> subset
> +# of max_connections, so setting the same value will fail.
> +if ($old_publisher->pg_version->major < 12)
> +{
> + $old_publisher->append_conf(
> + 'postgresql.conf', qq[
> + max_wal_senders = 5
> + max_connections = 10
> + ]);
> +}
>
> If the comment is correct, then PG12 *and* prior, should be testing
> "<= 12", not "< 12". right?

I analyzed more and I was wrong - we must set GUCs here only for PG9.6-.
Regarding PG11 and PG10, the corresponding constructor will be chosen in new() [a],
and these instance will set max_wal_senders to 5 [b].
As for PG9.6-, the related package has not been defined yet so that such a
workaround will not be used. So we must set manually.

Actually, the part will be not needed when Cluster.pm supports PG9.6-. If needed
we can start another thread and support them. For now the case is handled ad-hoc.

> 10.
> +# Test according to the major version of the old cluster.
> +# Upgrading logical replication slots has been supported only since PG17.
> +if ($old_publisher->pg_version->major >= 17)
>
> This comment seems wrong IMO. I think we always running the latest
> version of pg_upgrade so slot migration is always "supported" from now
> on. IIUC you intended this comment to be saying something about the
> old_publisher slots.
>
> BEFORE
> Upgrading logical replication slots has been supported only since PG17.
>
> SUGGESTION
> Upgrading logical replication slots from versions older than PG17 is
> not supported.

Fixed.

[a]:
```
# Use a subclass as defined below (or elsewhere) if this version
# isn't fully compatible. Warn if the version is too old and thus we don't
# have a subclass of this class.
if (ref $ver && $ver < $min_compat)
{
my $maj = $ver->major(separator => '_');
my $subclass = $class . "::V_$maj";
if ($subclass->isa($class))
{
bless $node, $subclass;
}
```

[b]:
```
sub init
{
my ($self, %params) = @_;
$self->SUPER::init(%params);
$self->adjust_conf('postgresql.conf', 'max_wal_senders',
$params{allows_streaming} ? 5 : 0);
}
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v53-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch application/octet-stream 47.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-10-19 10:43:57 RE: [PoC] pg_upgrade: allow to upgrade publisher node
Previous Message Daniel Gustafsson 2023-10-19 10:11:13 Re: Special-case executor expression steps for common combinations