Re: Synchronizing slots from primary to standby

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-02-08 06:37:35
Message-ID: CAHut+Pv-xvw_cVKkNpTTnHt8JoY_H0ShA_V31rmUZ-TbuJRaag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for patch v80_2-0001.

======
Commit message

1.
We may also see the the slots invalidated and dropped on the standby if the
primary changes 'wal_level' to a level lower than logical. Changing the primary
'wal_level' to a level lower than logical is only possible if the logical slots
are removed on the primary server, so it's expected to see the slots
being removed
on the standby too (and re-created if they are re-created on the
primary server).

~

Typo /the the/the/

======
src/sgml/logicaldecoding.sgml

2.
+ <para>
+ The logical replication slots on the primary can be synchronized to
+ the hot standby by enabling <literal>failover</literal> during slot
+ creation (e.g. using the <literal>failover</literal> parameter of
+ <link linkend="pg-create-logical-replication-slot">
+ <function>pg_create_logical_replication_slot</function></link>, or
+ using the <link linkend="sql-createsubscription-params-with-failover">
+ <literal>failover</literal></link> option of the CREATE SUBSCRIPTION
+ command), and then calling <link linkend="pg-sync-replication-slots">
+ <function>pg_sync_replication_slots</function></link>
+ on the standby. For the synchronization to work, it is mandatory to
+ have a physical replication slot between the primary and the standby, and
+ <link linkend="guc-hot-standby-feedback"><varname>hot_standby_feedback</varname></link>
+ must be enabled on the standby. It is also necessary to specify a valid
+ <literal>dbname</literal> in the
+ <link linkend="guc-primary-conninfo"><varname>primary_conninfo</varname></link>.
+ </para>

Shveta previously asked: Regarding link to create-sub, the
'sql-createsubscription-params-with-failover' takes you to the
failover property of Create-Subscription page. Won't that suffice?

PS: Yes, the current links in 80_2 are fine.

~

2a.
In hindsight, maybe it is simpler just to say "option of CREATE
SUBSCRIPTION." instead of "option of the CREATE SUBSCRIPTION command."

~

2b.
Anyway, the "CREATE SUBSCRIPTION" should be rendered as a <command>

======

3.
+/*
+ * Flag to tell if we are syncing replication slots. Unlike the 'syncing' flag
+ * in SlotSyncCtxStruct, this flag is true only if the current process is
+ * performing slot synchronization. This flag is also used as safe-guard
+ * to clean-up shared 'syncing' flag of SlotSyncCtxStruct if some problem
+ * happens while we are in the process of synchronization.
+ */

3a.
It looks confusing to use the same word "process" to mean 2 different things.

SUGGESTION
This flag is also used as a safeguard to reset the shared 'syncing'
flag of SlotSyncCtxStruct if some problem occurs while synchronizing.

~

3b.
TBH, I didn't think that 2nd sentence comment needed to be here -- it
seemed more appropriate to say this comment inline where it does this
logic in the function SyncReplicationSlots()

~~~

4. local_sync_slot_required

+/*
+ * Helper function to check if local_slot is required to be retained.
+ *
+ * Return false either if local_slot does not exist on the remote_slots list or
+ * is invalidated while the corresponding remote slot in the list is still
+ * valid, otherwise return true.
+ */

/does not exist on the remote_slots list/does not exist in the
remote_slots list/

/while the corresponding remote slot in the list is still valid/while
the corresponding remote slot is still valid/

~~~

5.
+ bool locally_invalidated = false;
+ bool remote_exists = false;
+

IMO it is more natural to declare these in the other order since the
function logic assigns/tests them in the other order.

~~~

6.
+
+ if (!remote_exists || locally_invalidated)
+ return false;
+
+ return true;

IMO it would be both simpler and easier to understand if this was
written as one line:

return remote_exists && !locally_invalidated;

~~~

7.
+ * Note: Change of 'wal_level' on the primary server to a level lower than
+ * logical may also result in slots invalidation and removal on standby. This
+ * is because such 'wal_level' change is only possible if the logical slots
+ * are removed on the primary server, so it's expected to see the slots being
+ * invalidated and removed on the standby too (and re-created if they are
+ * re-created on the primary).

/may also result in slots invalidation/may also result in slot invalidation/

/removal on standby/removal on the standby/

~~~

8.
+ /* Drop the local slot f it is not required to be retained. */
+ if (!local_sync_slot_required(local_slot, remote_slot_list))

I didn't think this comment was needed because IMO the function name
is self-explanatory. Anyway, if you do want to keep it, then there is
a typo to fix:

/f it is/if it is/

~~~

9.
+ * Update the LSNs and persist the local synced slot for further syncs if the
+ * remote restart_lsn and catalog_xmin have caught up with the local ones,
+ * otherwise do nothing.

Something about "persist ... for further syncs" wording seems awkward
to me but I wasn't sure exactly what it should be. When I fed this
comment into ChatGPT it interpreted "further" as "future" which seemed
better.

e.g.
If the remote restart_lsn and catalog_xmin have caught up with the
local ones, then update the LSNs and store the local synced slot for
future synchronization; otherwise, do nothing.

Maybe that is a better way to express this comment?

~~~

10.
+/*
+ * Validates if all necessary GUCs for slot synchronization are set
+ * appropriately, otherwise raise ERROR.
+ */

/Validates if all/Check all/

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-02-08 07:01:00 Re: What about Perl autodie?
Previous Message Zhijie Hou (Fujitsu) 2024-02-08 06:35:55 RE: Synchronizing slots from primary to standby