Re: Documentation to upgrade logical replication cluster

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Documentation to upgrade logical replication cluster
Date: 2024-01-30 10:14:50
Message-ID: CALDaNm1G8pGY+W=fbMj3PPbpwUqexxQh3b8J=ON3jY0Nx4Ga9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 29 Jan 2024 at 16:01, Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Mon, Jan 29, 2024 at 10:10 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > Thanks for the comments, the attached v5 version patch has the changes
> > for the same.
>
> Thanks for working on this. Here are some comments on the v5 patch:
>
> 1.
> + <para>
> + Migration of logical replication clusters is possible only when all the
> + members of the old logical replication clusters are version 17.0 or later.
>
> Perhaps define what logical replication cluster is either in glossary
> or within a parenthesis next to the first use in the docs? This will
> help developers understand it better and will not confuse them with
> postgres cluster. I see it being used for the first time in code
> comments 9a17be1e2, but this patch uses it for the first time in the
> docs.

I have added it in glossary.

> 2.
> + Before reading this section, refer <xref linkend="pgupgrade"/> page for
> + more details about pg_upgrade.
> + </para>
>
> This looks extraneous, we can just link to pg_upgrade on the first use
> of pg_upgrade, change the following
>
> + <para>
> + <application>pg_upgrade</application> attempts to migrate logical
> + slots. This helps avoid the need for manually defining the same
>
> to
>
> + <para>
> + <xref linkend="pgupgrade"/> attempts to migrate logical
> + slots. This helps avoid the need for manually defining the same

Modified

> 3.
> + transactional, the user is advised to take backups. Backups can be taken
> + as described in <xref linkend="backup-base-backup"/>.
> + </para>
>
> How about simplifying the above to "the user is advised to take
> backups as described in <xref linkend="backup-base-backup"/>" instead
> of two statements?

Modified

> 4.
> subscription is temporarily disabled, by executing
> + <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION
> ... DISABLE</command></link>.
> + Re-enable the subscription after the upgrade.
> + </para>
>
> Is it to avoid repeated failures of logical replication apply workers
> on the subscribers? Isn't it good to say why subscription needs to be
> disabled?

Added it

> 5.
> + <para>
> + There are some prerequisites for <application>pg_upgrade</application> to
> + be able to upgrade the logical slots. If these are not met an error
> + will be reported.
> + </para>
>
> I think it's better to be "Following are prerequisites for
> <application>pg_upgrade</application> to.."?

Modified

> 6.
> + <listitem>
> + <para>
> + The old cluster has replicated all the transactions and logical decoding
> + messages to subscribers.
> + </para>
>
> I think it's better to be "The old cluster must have replicated all
> the transactions and ...."?

Modified

> 7.
> + <para>
> + The new cluster must not have permanent logical slots, i.e.,
> + there must be no slots where
> + <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>temporary</structfield>
> + is <literal>false</literal>.
>
> I think we better specify a full SQL query as opposed to just
> specifying one output column and the view name.
>
> <para>
> The new cluster must not have permanent logical slots, i.e., a query like:
> <programlisting>
> SELECT count(*) FROM pg_replication_slots WHERE slot_type = 'logical'
> AND temporary IS false;
> </programlisting>
> must return 0.
> </para>

Modified

> 8.
> + If the old cluster is prior to 17.0, then no slots on the primary are
> + copied to the new standby, so all the slots must be recreated manually.
> + If the old cluster is 17.0 or later, then only logical slots on the
>
> I think it's better to say "version 17.0" instead of just "17.0".

Modified

> 9.
> + primary are copied to the new standby, but other slots on the
> old standby
>
> "but other slots on the old standby" - is it slots on the old standby
> or old cluster?
>
> I think it's the other way around: the old cluster needs to be
> replaced with the old standby in the newly added paragraph.

Modified it to old primary as we upgrade primary and do a rsync

> 10.
> Change
> + primary are copied to the new standby, but other slots on the
> old standby
> + are not copied so must be recreated manually.
>
> to
>
> + primary are copied to the new standby, but other slots on the
> old standby
> + are not copied, so must be recreated manually.

Modified

> 11.
> + <note>
> + <para>
> + The logical replication restrictions apply to logical replication cluster
> + upgrades also. See <xref linkend="logical-replication-restrictions"/> for
> + the details of logical replication restrictions.
> + </para>
>
> How about just say "See <xref
> linkend="logical-replication-restrictions"/> for details." instead of
> using logical replication restrictions more than once in the same
> para?

Modified

> 12.
> + <para>
> + The prerequisites of publisher upgrade apply to logical replication
> + cluster upgrades also. See <xref linkend="prepare-publisher-upgrades"/>
> + for the details of publisher upgrade prerequisites.
>
> How about just say "See <xref linkend="prepare-publisher-upgrades"/>
> for details." instead of using publisher upgrade prerequisites more
> than once in the same para?

Modified

> 13.
> + <para>
> + The prerequisites of subscriber upgrade apply to logical replication
> + cluster upgrades also. See <xref linkend="prepare-subscriber-upgrades"/>
> + for the details of subscriber upgrade prerequisites.
> + </para>
>
> How about just say "See <xref linkend="prepare-subscriber-upgrades"/>
> for details." instead of using subscriber upgrade prerequisites more
> than once in the same para?

Modified

> 14.
> + Upgrading logical replication cluster requires multiple steps to be
> + performed on various nodes. Because not all operations are
>
> Per comment #1, defining logical replication clusters and nodes helps
> clearly distinguish. For instance, one can get confused with the
> various terms in hand - postgres cluster, logical replication cluster,
> node etc.

I have added "logical replication clusters". I felt no need to add
node as it is not a new terminology. It is already being used in many
places like in [1], [2] & [3]

> 15.
> + two subscriptions <literal>sub1_node1_node2</literal> and
> + <literal>sub2_node1_node2</literal> which are subscribing the changes
> + from <literal>node1</literal>.
>
> Why confluse with subsription names by including node1 and node2 in
> it? We are not creating subscriptions from node1 to node2, are we? I'd
> recommend using simplified names like mysub1, mysub2 like elsewhere in
> the documentation.

I have used the name sub1_node1_node to indicate it is subscribing
changes from node1 to node2. I felt this is self explainatory names.

> 16.
> + Let's say publisher is in <literal>node1</literal> and subscriber is
> + in <literal>node2</literal>.
>
> How about saying "publisher is in a database cluster named
> <literal>node1</literal> and subscriber is in database cluster named
> <literal>node2</literal>"? I think using this terminology helps.

I felt existing is ok as similar is used in [2] & [3]

> 17.
> + refer to <xref linkend="logical-replication-upgrade"/> for details.
> + </para>
> + </note>
>
> IMHO, it could have been better if steps to upgrade the logical
> replication cluster is specified in pgupgrade.sgml as opposed to
> logical-replication.sgml. Because, upgrading logical replication
> cluster is a sub-section for pg_upgrade.

As the content for logical replication is more, I felt it is better to
keep it here and also we have given a link to this in the pg_upgrade
page. I did not want the upgrade page to become bulky because of the
logical replication upgrade section.

> 18.
> + <para>
> + The steps to upgrade the following logical replication clusters are
> + detailed below:
> + <itemizedlist>
> + <listitem>
> + <para>
> + Follow the steps specified in
>
> I think we can talk about what advantages upgrading logical
> replication clusters brings in. We can say that the pg_upgrade makes
> it possible 1) to re-use the logical replication slots post-upgrade,
> 2) to re-use the subscribers i.e. now it's not required to re-create
> all the logical subscribers after the upgrade, so no initial table
> sync, no creation of new clusters for subscribers etc.

I felt this is self explanatory, no need to mention about the
complexity involved in the manual steps involved. As the same is not
mentioned in case of streaming replication too at [4].

> 19. I think we can talk about the possible gotchas i.e. the things
> that can go wrong while performing any of the prescribed steps. What
> happens if the slot the pg_upgrade interrupts after it upgraded a few
> of the replication slots or if some of the prerequisites are not met
> etc.?

There is the note below when we run pg_upgrade:
"If pg_upgrade fails after this point, you must re-initdb the new
cluster before continuing."
I felt this is kind of self explanatory. Also the pre-requisite
mentions clearly about the configurations that must be set before
upgrade is run. So I felt the existing information was enough.

Thanks for the comment, the attached v6 version patch has the changes
for the same.

[1] - https://www.postgresql.org/docs/devel/logical-replication.html
[2] - https://www.postgresql.org/docs/devel/logical-replication-publication.html
[3] - https://www.postgresql.org/docs/devel/logical-replication-subscription.html
[4] - https://www.postgresql.org/docs/devel/pgupgrade.html

Regards,
Vignesh

Attachment Content-Type Size
v6-0001-Documentation-for-upgrading-logical-replication-c.patch text/x-patch 35.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2024-01-30 10:18:32 Re: Report planning memory in EXPLAIN ANALYZE
Previous Message John Naylor 2024-01-30 10:04:20 Re: Change GUC hashtable to use simplehash?