RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Peter Smith' <smithpb2250(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-05-11 08:56:04
Message-ID: TYAPR01MB5866BD618DEE62AF1836E612F5749@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Peter,

Thanks for reviewing! New patch can be available at [1].

> 1. help
>
> printf(_(" --inserts dump data as INSERT
> commands, rather than COPY\n"));
> printf(_(" --load-via-partition-root load partitions via the
> root table\n"));
> + printf(_(" --logical-replication-slots-only\n"
> + " dump only logical replication slots,
> no schema or data\n"));
> printf(_(" --no-comments do not dump comments\n"));
>
> Now you removed the PG Docs for the internal pg_dump option based on
> my previous review comment (see [2]#1). So does it mean this "help"
> also be removed so this option will be completely invisible to the
> user? I am not sure, but if you do choose to remove this help then
> probably a comment should be added here to explain why it is
> deliberately not listed.

Removed from help and comments were added instead.

> 2. check_new_cluster
>
> Although you wrote "Added", I don't think my previous comment ([1]#8)
> was yet addressed.
>
> What I mean to say ask was: can that call to get_logical_slot_infos()
> be done later, only when you know that option was specified?
>
> e.g
>
> BEFORE
> get_logical_slot_infos(&new_cluster);
> ...
> if (user_opts.include_logical_slots)
> check_for_parameter_settings(&new_cluster);
>
> SUGGESTION
> if (user_opts.include_logical_slots)
> {
> get_logical_slot_infos(&new_cluster);
> check_for_parameter_settings(&new_cluster);
> }

Sorry for missing your comments. But I think get_logical_slot_infos() cannot be
executed later. In check_new_cluster_is_empty(), we must check not to exist any
replication slots on the new node because all of WALs will be truncated. Infos
related with slots are stored in get_logical_slot_infos(), so it must be executed
before check_new_cluster_is_empty(). Another possibility is to execute
check_for_parameter_settings() earlier, and I tried to do. The style seems little
bit strange, but it worked well. How do you think?

> 3. get_db_and_rel_infos
>
> > src/bin/pg_upgrade/info.c
> >
> > 11. get_db_and_rel_infos
> >
> > + {
> > get_rel_infos(cluster, &cluster->dbarr.dbs[dbnum]);
> >
> > + /*
> > + * Additionally, slot_arr must be initialized because they will be
> > + * checked later.
> > + */
> > + cluster->dbarr.dbs[dbnum].slot_arr.nslots = 0;
> > + cluster->dbarr.dbs[dbnum].slot_arr.slots = NULL;
> > + }
> >
> > 11a.
> > I think probably it would have been easier to just use 'pg_malloc0'
> > instead of 'pg_malloc' in the get_db_infos, then this code would not
> > be necessary.
> I was not sure whether it is OK to change like that because of the
> performance efficiency. But OK, fixed.
> > 11b.
> > BTW, shouldn't this function also be calling free_logical_slot_infos()
> > too? That will also have the same effect (initializing the slot_arr)
> > but without having to change anything else.
>
> ~
>
> Above is your reply ([2]11a). If you were not sure about the malloc0
> then I think the suggestion ([1]#12b) achieves the same thing and
> initializes those fields. You did not reply to 12b, so I wondered if
> you accidentally missed that point.

Sorry, this part is no longer needed. Please see below.

> 4. get_logical_slot_infos
>
> + for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
> + {
> + DbInfo *pDbInfo = &cluster->dbarr.dbs[dbnum];
> +
> + if (pDbInfo->slot_arr.slots)
> + free_logical_slot_infos(&pDbInfo->slot_arr);
>
> Maybe it is ok, but it seems unusual that this
> get_logical_slot_infos() is also doing a free. I didn't notice this
> same pattern with the other get_XXX functions. Why is it needed? Even
> if pDbInfo->slot_arr.slots was not NULL, is the information stale or
> will you just end up re-fetching the same info?

After considering more, I decided to remove the free function.

The reason why I did is that get_logical_slot_infos() for the new cluster is
called twice, one is for checking purpose in check_new_cluster() and
another is for updating the cluster info in create_logical_replication_slots().
At the first calling, we assume that logical slots do not exist on new node, but
even if the case a small memory are is allocated by pg_malloc(0).
(If there are some slots, it is not called twice.)
But I noticed that it can be avoided by adding the if-statement, so I did.

Additionally, the pg_malloc0() in get_db_and_rel_infos() is no more needed
because we do not have to check the un-initialized area.

> .../pg_upgrade/t/003_logical_replication_slots.pl
>
> 5.
> +# Preparations for the subsequent test. The case max_replication_slots is set
> +# to 0 is prohibit.
>
> /prohibit/prohibited/

Fixed.

[1]: https://www.postgresql.org/message-id/TYAPR01MB5866A3B91F56056A803B94DAF5749%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-05-11 09:09:28 Re: pg_upgrade - typo in verbose log
Previous Message Hayato Kuroda (Fujitsu) 2023-05-11 08:55:08 RE: [PoC] pg_upgrade: allow to upgrade publisher node