RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Peter Smith' <smithpb2250(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-09-12 07:04:23
Message-ID: TYAPR01MB58667F90E34874E6680A4DABF5F1A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Peter,

Thank you for reviewing! Before posting new patch set, I want to respond some
comments.

>
> ======
> 1. GENERAL -- Cluster Terminology
>
> This is not really a problem of your patch, but during message review,
> I noticed the terms old/new cluster VERSUS source/target cluster and
> both were used many times:
>
> For example.
> ".*new clusmter --> 44 occurences
> ".*old cluster --> 21 occurences
> ".*source cluster --> 6 occurences
> ".*target cluster --> 12 occurences
>
> Perhaps there should be a new thread/patch to use consistent terms.
>
> Thoughts?

I preferred the term new/old because I could not found the term source/target
in the documentation for the pg_upgrade. (IIUC I used new/old in my patch).
Anyway, it should be discussed in another thread.

> 2. GENERAL - Error message cases
>
> Just FYI, there are many inconsistent capitalising in these patch
> messages, but then the same is also true for the HEAD code. It's a bit
> messy, but generally, I think your capitalisation was aligned with
> what I saw in HEAD, so I didn't comment anywhere about it.

Yeah, the rule is broken even in HEAD. I determined a rule in [1], which seems
consistent with other parts in the file.
Michael kindly told the error message formatting [2], and basically it follows the
style. (IIUC pg_fatal("Your installation...") is followed the
"Detail and hint messages" rule.)

> ======
> src/bin/pg_upgrade/info.c
>
> 7. get_db_rel_and_slot_infos
>
> void
> get_db_rel_and_slot_infos(ClusterInfo *cluster)
> {
> int dbnum;
>
> if (cluster->dbarr.dbs != NULL)
> free_db_and_rel_infos(&cluster->dbarr);
>
> ~
>
> Judging from the HEAD code this function was intended to be reentrant
> -- e.g. it does cleanup code free_db_and_rel_infos in case there was
> something there from before.
>
> IIUC there is no such cleanup for the slot_arr. I forget why this was
> removed. Sure, you might be able to survive the memory leaks, but
> choosing NOT to clean up the slot_arr seems to contradict the
> intention of HEAD calling free_db_and_rel_infos.

free_db_and_rel_infos() is called if get_db_rel_and_slot_infos() is called
several times for the same cluster. Followings are callers:

* check_and_dump_old_cluster(), target is old_cluster
* check_new_cluster(), target is new_cluster
* create_new_objects(), target is new_cluster

And we requires that new_cluster must not have logical slots, this restriction
cannot ease. Therefore, there are no possibilities slot_arr must be free()'d,
so that I removed (See similar discussion [3]). I think we should not add no-op codes.
In old version there was an Assert() instead, but removed based on the comment [4].

> 8. get_db_infos
>
> I noticed the pg_malloc0 is reverted in this function.
>
> - dbinfos = (DbInfo *) pg_malloc(sizeof(DbInfo) * ntups);
> + dbinfos = (DbInfo *) pg_malloc0(sizeof(DbInfo) * ntups);
>
> IMO it is better to do pg_malloc0 here.
>
> Sure, everything probably works OK for the current code,

Yes, it works well. No one checks slot_arr before
get_old_cluster_logical_slot_infos(). In the old version, it was checked like
(slot_arr == NULL) infree_db_and_rel_infos(), but removed.

> but it seems
> unnecessarily risky to assume that functions will forever be called in
> a specific order. AFAICT if someone (e.g. for debugging) calls
> count_old_cluster_logical_slots() or calls print_slot_infos() then the
> behaviour is undefined because slot_arr.nslots remains uninitialized.

Hmm, I do not think such assumption is needed. In the current code pg_malloc() is
used in get_db_infos(), so there is a possibility that print_rel_infos() is
executed for debugging. The behavior is undefined - this is same as you said,
and code has been alive. Based on that I think we can accept the risk and
reduce operations instead. If you knew other example, please share here...

[1]: https://www.postgresql.org/message-id/TYAPR01MB586642D33208D190F67CDD7BF5F2A%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[2]: https://www.postgresql.org/docs/devel/error-style-guide.html#ERROR-STYLE-GUIDE-GRAMMAR-PUNCTUATION
[3]: https://www.postgresql.org/message-id/TYAPR01MB5866732D30ABB976992BDECCF5789%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[4]: https://www.postgresql.org/message-id/OS0PR01MB5716670FE547BA87FDEF895E94EDA%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message 쿼리트릭스 2023-09-12 07:27:18 Re: [ psql - review request ] review request for \d+ tablename, \d+ indexname indenting
Previous Message Peter Eisentraut 2023-09-12 06:27:35 Re: Make --help output fit within 80 columns per line