RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-10-24 06:02:21
Message-ID: TYAPR01MB586644F209335F2B35A01243F5DFA@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Bharath, Amit,

Thanks for reviewing! PSA new version.
I addressed comments which have not been claimed.

> On Mon, Oct 23, 2023 at 2:00 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Mon, Oct 23, 2023 at 11:10 AM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > >
> > > Thank you for reviewing! PSA new version.
> >
> > > > 6. A nit: how about is_decodable_txn or is_decodable_change or some
> > > > other instead of just a plain name processing_required?
> > > > + /* Do we need to process any change in 'fast_forward' mode? */
> > > > + bool processing_required;
> > >
> > > I preferred current one. Because not only decodable txn, non-txn change and
> > > empty transactions also be processed.
> >
> > Right. It's not the txn, but the change. processing_required seems too
> > generic IMV. A nit: is_change_decodable or something?
> >
>
> If we don't want to keep it generic then we should use something like
> 'contains_decodable_change'. 'is_change_decodable' could have suited
> here if we were checking a particular change.

I kept the name for now. How does Bharath think?

> > Thanks for the patch. Here are few comments on v56 patch:
> >
> > 1.
> > + *
> > + * Although this function is currently used only during pg_upgrade, there are
> > + * no reasons to restrict it, so IsBinaryUpgrade is not checked here.
> >
> > This comment isn't required IMV, because anyone looking at the code
> > and callsites can understand it.

Removed.

> > 2. A nit: IMV "This is a special purpose ..." statement seems redundant.
> > + *
> > + * This is a special purpose function to ensure that the given slot can be
> > + * upgraded without data loss.
> >
> > How about
> >
> > Verify that the given replication slot has consumed all the WAL changes.
> > If there's any decodable WAL record after the slot's
> > confirmed_flush_lsn, the slot's consumer will lose that data after the
> > slot is upgraded.
> > Returns true if there are no decodable WAL records after the
> > confirmed_flush_lsn. Otherwise false.
> >
>
> Personally, I find the current comment succinct and clear.

I kept current one.

> > 3.
> > + if (PG_ARGISNULL(0))
> > + elog(ERROR, "null argument to
> > binary_upgrade_validate_wal_records is not allowed");
> >
> > I can see the above style is referenced from
> > binary_upgrade_create_empty_extension, but IMV the following looks
> > better and latest (ereport is new style than elog)
> >
> > ereport(ERROR,
> > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > errmsg("replication slot name cannot be null")));
> >
>
> Do you have any theory for making elog to ereport? I am not completely
> sure but as this and related function is used internally, so using
> elog seems reasonable. Also, I find keeping it consistent with the
> existing error message is also reasonable. We can change both later
> together if we get a broader agreement.

I kept current style. elog() was used here because I regarded it as
"cannot happen" error. According to the doc [1], elog() is still used
for the purpose.

> > 4. The following comment seems frivolous, the code tells it all.
> > Please remove the comment.
> > +
> > + /* No need to check this slot, seek to new one */
> > + continue;

Removed.

> > 5. A typo - s/gets/Gets
> > + * gets the LogicalSlotInfos for all the logical replication slots of the

Replaced.

> > 6. An optimization in count_old_cluster_logical_slots(void): Turn
> > slot_count to a function static variable so that the for loop isn't
> > required every time because the slot count is prepared in
> > get_old_cluster_logical_slot_infos only once and won't change later
> > on. Do you see any problem with the following? This saves a few CPU
> > cycles when there are large number of replication slots.
> > {
> > static int slot_count = 0;
> > static bool first_time = true;
> >
> > if (first_time)
> > {
> > for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
> > slot_count += old_cluster.dbarr.dbs[dbnum].slot_arr.nslots;
> >
> > first_time = false;
> > }
> >
> > return slot_count;
> > }
> >
>
> This may not be a problem but this is also not a function that will be
> used frequently. I am not sure if adding such code optimizations is
> worth it.

Not addressed.

> > 7. A typo: s/slotname/slot name. "slot name" looks better in user
> > visible messages.
> > + pg_log(PG_VERBOSE, "slotname: \"%s\", plugin: \"%s\",
> two_phase: %s",
> >
>
> If we want to follow other parameters then we can even use slot_name.

Changed to slot_name.

Below part is replies for remained comments:

>8.
>+else
>+{
>+ test_upgrade_from_pre_PG17($old_publisher, $new_publisher,
>+ @pg_upgrade_cmd);
>+}
>Will this ever be tested in current TAP test framework? I mean, will
>the TAP test framework allow testing upgrades from one PG version to
>another PG version?

Yes, the TAP tester allow to do cross-version upgrade. According to
src/bin/pg_upgrade/TESTING file:

```
Testing an upgrade from a different PG version is also possible, and
provides a more thorough test that pg_upgrade does what it's meant for.
```

Below commands are an example of the test.

```
# test PG9.5 -> patched HEAD
$ oldinstall=/home/hayato/older/pg95 make check PROVE_TESTS='t/003_upgrade_logical_replication_slots.pl'
...
# +++ tap check in src/bin/pg_upgrade +++
t/003_upgrade_logical_replication_slots.pl .. ok
All tests successful.
Files=1, Tests=3, 11 wallclock secs ( 0.03 usr 0.01 sys + 2.78 cusr 1.08 csys = 3.90 CPU)
Result: PASS

# grep the output and find an evidence that cross-version check was done
$ cat tmp_check/log/regress_log_003_upgrade_logical_replication_slots | grep 'check the slot does not exist on new cluster'
[05:14:22.322](0.139s) ok 3 - check the slot does not exist on new cluster

```

>9. A nit: Can single quotes around variable names in the comments be
>removed just to be consistent?
>+ * We also skip decoding in 'fast_forward' mode. This check must be last
>+ /* Do we need to process any change in 'fast_forward' mode? */

Removed.

Also, based on a comment [2], the upgrade function was renamed to
'binary_upgrade_logical_slot_has_caught_up'.

[1]: https://www.postgresql.org/docs/devel/error-message-reporting.html
[2]: https://www.postgresql.org/message-id/CAA4eK1%2BYZP3j1H4ChhzSR23k6MPryW-cgGstyvqbek2CMJoHRA%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v57-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch application/octet-stream 48.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Junwang Zhao 2023-10-24 06:07:29 Re: Add trailing commas to enum definitions
Previous Message Kyotaro Horiguchi 2023-10-24 06:00:27 Re: pg_ctl start may return 0 even if the postmaster has been already started on Windows