Re: Synchronizing slots from primary to standby

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-02-27 07:17:44
Message-ID: CAHut+PsGTQb0ijYOfd=x=qX5Aubj=F-pqnfjQ47Jt3MeyBipqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for v99-0001

==========
0. GENERAL.

+#standby_slot_names = '' # streaming replication standby server slot names that
+ # logical walsender processes will wait for

IMO the GUC name is too generic. There is nothing in this name to
suggest it has anything to do with logical slot synchronization; that
meaning is only found in the accompanying comment -- it would be
better if the GUC name itself were more self-explanatory.

e.g. Maybe like 'wal_sender_sync_standby_slot_names' or
'wal_sender_standby_slot_names', 'wal_sender_wait_for_standby_slots',
or ...

(Of course, this has some impact across docs and comments and
variables in the patch).

==========
Commit Message

1.
A new parameter named standby_slot_names is introduced.

Maybe quote the GUC names here to make it more readable.

~~

2.
Additionally, The SQL functions pg_logical_slot_get_changes and
pg_replication_slot_advance are modified to wait for the replication slots
mentioned in standby_slot_names to catch up before returning the changes
to the user.

~

2a.
"pg_replication_slot_advance" is a typo? Did you mean
pg_logical_replication_slot_advance?

~

2b.
The "before returning the changes to the user" seems like it is
referring only to the first function.

Maybe needs slight rewording like:
/before returning the changes to the user./ before returning./

==========
doc/src/sgml/config.sgml

3. standby_slot_names

+ <para>
+ List of physical slots guarantees that logical replication slots with
+ failover enabled do not consume changes until those changes
are received
+ and flushed to corresponding physical standbys. If a logical
replication
+ connection is meant to switch to a physical standby after the
standby is
+ promoted, the physical replication slot for the standby
should be listed
+ here. Note that logical replication will not proceed if the slots
+ specified in the standby_slot_names do not exist or are invalidated.
+ </para>

The wording doesn't seem right. IMO this should be worded much like
how this GUC is described in guc_tables.c

e.g something a bit like:

Lists the streaming replication standby server slot names that logical
WAL sender processes will wait for. Logical WAL sender processes will
send decoded changes to plugins only after the specified replication
slots confirm receiving WAL. This guarantees that logical replication
slots with failover enabled do not consume changes until those changes
are received and flushed to corresponding physical standbys...

==========
doc/src/sgml/logicaldecoding.sgml

4. Section 48.2.3 Replication Slot Synchronization

+ It's also highly recommended that the said physical replication slot
+ is named in
+ <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>
+ list on the primary, to prevent the subscriber from consuming changes
+ faster than the hot standby. But once we configure it, then
certain latency
+ is expected in sending changes to logical subscribers due to wait on
+ physical replication slots in
+ <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>

4a.
/It's also highly/It is highly/

~

4b.

BEFORE
But once we configure it, then certain latency is expected in sending
changes to logical subscribers due to wait on physical replication
slots in <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>

SUGGESTION
Even when correctly configured, some latency is expected when sending
changes to logical subscribers due to the waiting on slots named in
standby_slot_names.

==========
.../replication/logical/logicalfuncs.c

5. pg_logical_slot_get_changes_guts

+ if (XLogRecPtrIsInvalid(upto_lsn))
+ wait_for_wal_lsn = end_of_wal;
+ else
+ wait_for_wal_lsn = Min(upto_lsn, end_of_wal);
+
+ /*
+ * Wait for specified streaming replication standby servers (if any)
+ * to confirm receipt of WAL up to wait_for_wal_lsn.
+ */
+ WaitForStandbyConfirmation(wait_for_wal_lsn);

Perhaps those statements all belong together with the comment up-front. e.g.

+ /*
+ * Wait for specified streaming replication standby servers (if any)
+ * to confirm receipt of WAL up to wait_for_wal_lsn.
+ */
+ if (XLogRecPtrIsInvalid(upto_lsn))
+ wait_for_wal_lsn = end_of_wal;
+ else
+ wait_for_wal_lsn = Min(upto_lsn, end_of_wal);
+ WaitForStandbyConfirmation(wait_for_wal_lsn);

==========
src/backend/replication/logical/slotsync.c

==========
src/backend/replication/slot.c

6.
+static bool
+validate_standby_slots(char **newval)
+{
+ char *rawname;
+ List *elemlist;
+ ListCell *lc;
+ bool ok;
+
+ /* Need a modifiable copy of string */
+ rawname = pstrdup(*newval);
+
+ /* Verify syntax and parse string into a list of identifiers */
+ ok = SplitIdentifierString(rawname, ',', &elemlist);
+
+ if (!ok)
+ {
+ GUC_check_errdetail("List syntax is invalid.");
+ }
+
+ /*
+ * If the replication slots' data have been initialized, verify if the
+ * specified slots exist and are logical slots.
+ */
+ else if (ReplicationSlotCtl)
+ {
+ foreach(lc, elemlist)

6a.
So, if the ReplicationSlotCtl is NULL, it is possible to return
ok=true without ever checking if the slots exist or are of the correct
kind. I am wondering what are the ramifications of that. -- e.g.
assuming names are OK when maybe they aren't OK at all. AFAICT this
works because it relies on getting subsequent WARNINGS when calling
FilterStandbySlots(). If that is correct then maybe the comment here
can be enhanced to say so.

Indeed, if it works like that, now I am wondering do we need this for
loop validation at all. e.g. it seems just a matter of timing whether
we get ERRORs validating the GUC here, or WARNINGS later in the
FilterStandbySlots. Maybe we don't need the double-checking and it is
enough to check in FilterStandbySlots?

~

6b.
AFAIK there are alternative foreach macros available now, so you
shouldn't need to declare the ListCell.

~~~

7. check_standby_slot_names

+bool
+check_standby_slot_names(char **newval, void **extra, GucSource source)
+{
+ if (strcmp(*newval, "") == 0)
+ return true;

Using strcmp seems like an overkill way to check for empty string.

SUGGESTION

if (*newval == '\0')
return true;

~~~

8.
+ if (strcmp(*newval, "*") == 0)
+ {
+ GUC_check_errdetail("\"%s\" is not accepted for standby_slot_names",
+ *newval);
+ return false;
+ }

It seems overkill to use a format specifier when "*" is already the known value.

SUGGESTION
GUC_check_errdetail("Wildcard \"*\" is not accepted for standby_slot_names.");

~~~

9.
+ /* Now verify if the specified slots really exist and have correct type */
+ if (!validate_standby_slots(newval))
+ return false;

As in a prior comment, if ReplicationSlotCtl is NULL then it is not
always going to do exactly what that comment says it is doing...

~~~

10. assign_standby_slot_names

+ if (!SplitIdentifierString(standby_slot_names_cpy, ',', &standby_slots))
+ {
+ /* This should not happen if GUC checked check_standby_slot_names. */
+ elog(ERROR, "invalid list syntax");
+ }

I didn't see how it is possible to get here without having first
executed check_standby_slot_names. But, if it can happen, then maybe
describe the scenario in the comment.

~~~

11.
+ * Note that since we do not support syncing slots to cascading standbys, we
+ * return NIL if we are running in a standby to indicate that no standby slots
+ * need to be waited for, regardless of the copy flag value.

I didn't understand the relevance of even mentioning "regardless of
the copy flag value".

~~~

12. FilterStandbySlots

+ errhint("Consider starting standby associated with \"%s\" or amend
standby_slot_names.",
+ name));

This errhint should use a format substitution for the GUC
"standby_slot_names" for consistency with everything else.

~~~

13. WaitForStandbyConfirmation

+ /*
+ * We wait for the slots in the standby_slot_names to catch up, but we
+ * use a timeout (1s) so we can also check the if the
+ * standby_slot_names has been changed.
+ */
+ ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, 1000,
+ WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION);

Typo "the if the"

==========
src/backend/replication/slotfuncs.c

14. pg_physical_replication_slot_advance
+
+ PhysicalWakeupLogicalWalSnd();

Should this have a comment to say what it is for?

==========
src/backend/replication/walsender.c

15.
+/*
+ * Wake up the logical walsender processes with failover enabled slots if the
+ * currently acquired physical slot is specified in standby_slot_names GUC.
+ */
+void
+PhysicalWakeupLogicalWalSnd(void)
+{
+ ListCell *lc;
+ List *standby_slots;
+
+ Assert(MyReplicationSlot && SlotIsPhysical(MyReplicationSlot));
+
+ standby_slots = GetStandbySlotList(false);
+
+ foreach(lc, standby_slots)
+ {
+ char *name = lfirst(lc);
+
+ if (strcmp(name, NameStr(MyReplicationSlot->data.name)) == 0)
+ {
+ ConditionVariableBroadcast(&WalSndCtl->wal_confirm_rcv_cv);
+ return;
+ }
+ }
+}

15a.
There already exists another function called WalSndWakeup(bool
physical, bool logical), so I think this new one should use a similar
name pattern -- e.g. maybe like WalSndWakeupLogicalForSlotSync or ...

~

15b.
IIRC there are some new List macros you can use instead of needing to
declare the ListCell?

==========
.../utils/activity/wait_event_names.txt

16.
+WAIT_FOR_STANDBY_CONFIRMATION "Waiting for the WAL to be received by
physical standby."

Moving the 'the' will make this more consistent with all other
"Waiting for WAL..." names.

SUGGESTION
Waiting for WAL to be received by the physical standby.

==========
src/backend/utils/misc/guc_tables.c

17.
+ {
+ {"standby_slot_names", PGC_SIGHUP, REPLICATION_PRIMARY,
+ gettext_noop("Lists streaming replication standby server slot "
+ "names that logical WAL sender processes will wait for."),
+ gettext_noop("Decoded changes are sent out to plugins by logical "
+ "WAL sender processes only after specified "
+ "replication slots confirm receiving WAL."),
+ GUC_LIST_INPUT | GUC_LIST_QUOTE
+ },
+ &standby_slot_names,
+ "",
+ check_standby_slot_names, assign_standby_slot_names, NULL
+ },

The wording of the detail msg feels kind of backwards to me.

BEFORE
Decoded changes are sent out to plugins by logical WAL sender
processes only after specified replication slots confirm receiving
WAL.

SUGGESTION
Logical WAL sender processes will send decoded changes to plugins only
after the specified replication slots confirm receiving WAL.

==========
src/backend/utils/misc/postgresql.conf.sample

18.
+#standby_slot_names = '' # streaming replication standby server slot names that
+ # logical walsender processes will wait for

I'm not sure this is the best GUC name. See the general comment #0
above in this post.

==========
src/include/replication/slot.h

==========
src/include/replication/walsender.h

==========
src/include/replication/walsender_private.h

==========
src/include/utils/guc_hooks.h

==========
src/test/recovery/t/006_logical_decoding.pl

19.
+# Pass failover=true (last-arg), it should not have any impact on advancing.

SUGGESTION
Passing failover=true (last arg) should not have any impact on advancing.

==========
.../t/040_standby_failover_slots_sync.pl

20.
+#
+# | ----> standby1 (primary_slot_name = sb1_slot)
+# | ----> standby2 (primary_slot_name = sb2_slot)
+# primary ----- |
+# | ----> subscriber1 (failover = true)
+# | ----> subscriber2 (failover = false)

In the diagram, the "--->" means a mixture of physical standbys and
logical pub/sub replication. Maybe it can be a bit clearer?

SUGGESTION

# primary (publisher)
#
# (physical standbys)
# | ----> standby1 (primary_slot_name = sb1_slot)
# | ----> standby2 (primary_slot_name = sb2_slot)
#
# (logical replication)
# | ----> subscriber1 (failover = true, slot_name = lsub1_slot)
# | ----> subscriber2 (failover = false, slot_name = lsub2_slot)

~~~

21.
+# Set up is configured in such a way that the logical slot of subscriber1 is
+# enabled failover, thus it will wait for the physical slot of
+# standby1(sb1_slot) to catch up before sending decoded changes to subscriber1.

/is enabled failover/is enabled for failover/

~~~

22.
+# Create another subscriber node without enabling failover, wait for sync to
+# complete
+my $subscriber2 = PostgreSQL::Test::Cluster->new('subscriber2');
+$subscriber2->init;
+$subscriber2->start;
+$subscriber2->safe_psql(
+ 'postgres', qq[
+ CREATE TABLE tab_int (a int PRIMARY KEY);
+ CREATE SUBSCRIPTION regress_mysub2 CONNECTION '$publisher_connstr'
PUBLICATION regress_mypub WITH (slot_name = lsub2_slot);
+]);
+
+$subscriber1->wait_for_subscription_sync;
+

Is this meant to wait for 'subscription2'?

~~~

23.
# Stop the standby associated with the specified physical replication slot so
# that the logical replication slot won't receive changes until the standby
# comes up.

Maybe this can give the values for better understanding:

SUGGESTION
Stop the standby associated with the specified physical replication
slot (sb1_slot) so that the logical replication slot (lsub1_slot)
won't receive changes until the standby comes up.

~~~

24.
+# Wait for the standby that's up and running gets the data from primary

SUGGESTION
Wait until the standby2 that's still running gets the data from the primary.

~~~

25.
+# Wait for the subscription that's up and running and is not enabled
for failover.
+# It gets the data from primary without waiting for any standbys.

SUGGESTION
Wait for subscription2 to get the data from the primary. This
subscription was not enabled for failover so it gets the data without
waiting for any standbys.

~~~

26.
+# The subscription that's up and running and is enabled for failover
+# doesn't get the data from primary and keeps waiting for the
+# standby specified in standby_slot_names.

SUGGESTION
The subscription1 was enabled for failover so it doesn't get the data
from primary and keeps waiting for the standby specified in
standby_slot_names (sb1_slot aka standby1).

~~~

27.
+# Start the standby specified in standby_slot_names and wait for it to catch
+# up with the primary.

SUGGESTION
Start the standby specified in standby_slot_names (sb1_slot aka
standby1) and wait for it to catch up with the primary.

~~~

28.
+# Now that the standby specified in standby_slot_names is up and running,
+# primary must send the decoded changes to subscription enabled for failover
+# While the standby was down, this subscriber didn't receive any data from
+# primary i.e. the primary didn't allow it to go ahead of standby.

SUGGESTION
Now that the standby specified in standby_slot_names is up and
running, the primary can send the decoded changes to the subscription
enabled for failover (i.e. subscription1). While the standby was down,
subscription1 didn't receive any data from the primary. i.e. the
primary didn't allow it to go ahead of standby.

~~~

29.
+# Stop the standby associated with the specified physical replication slot so
+# that the logical replication slot won't receive changes until the standby
+# slot's restart_lsn is advanced or the slot is removed from the
+# standby_slot_names list.
+$primary->safe_psql('postgres', "TRUNCATE tab_int;");
+$primary->wait_for_catchup('regress_mysub1');
+$standby1->stop;

Isn't this fragment more like the first step of the *next* TEST
instead of the last step of this one?

~~~

30.
+##################################################
+# Verify that when using pg_logical_slot_get_changes to consume changes from a
+# logical slot with failover enabled, it will also wait for the slots specified
+# in standby_slot_names to catch up.
+##################################################

AFAICT this test is checking only that the function cannot return
while waiting for the stopped standby, but it doesn't seem to check
that it *does* return when the stopped standby comes alive again.

~~~

31.
+$result =
+ $subscriber1->safe_psql('postgres', "SELECT count(*) = 0 FROM tab_int;");
+is($result, 't',
+ "subscriber1 doesn't get data as the sb1_slot doesn't catch up");

Do you think this fragment should have a comment?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2024-02-27 07:57:31 Re: Improve readability by using designated initializers when possible
Previous Message Andrew Dunstan 2024-02-27 07:10:12 Re: More new SQL/JSON item methods