RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2024-02-28 02:23:27
Message-ID: OS0PR01MB57160590C2CBED48976A113B94582@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tuesday, February 27, 2024 3:18 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:

>
> Here are some review comments for v99-0001

Thanks for the comments!

> Commit Message
>
> 1.
> A new parameter named standby_slot_names is introduced.
>
> Maybe quote the GUC names here to make it more readable.

Added.

>
> ~~
>
> 2.
> Additionally, The SQL functions pg_logical_slot_get_changes and
> pg_replication_slot_advance are modified to wait for the replication slots
> mentioned in standby_slot_names to catch up before returning the changes to
> the user.
>
> ~
>
> 2a.
> "pg_replication_slot_advance" is a typo? Did you mean
> pg_logical_replication_slot_advance?

pg_logical_replication_slot_advance is not a user visible function. So the
pg_replication_slot_advance is correct.

>
> ~
>
> 2b.
> The "before returning the changes to the user" seems like it is referring only to
> the first function.
>
> Maybe needs slight rewording like:
> /before returning the changes to the user./ before returning./

Changed.

>
> ==========
> doc/src/sgml/config.sgml
>
> 3. standby_slot_names
>
> + <para>
> + List of physical slots guarantees that logical replication slots with
> + failover enabled do not consume changes until those changes
> are received
> + and flushed to corresponding physical standbys. If a logical
> replication
> + connection is meant to switch to a physical standby after the
> standby is
> + promoted, the physical replication slot for the standby
> should be listed
> + here. Note that logical replication will not proceed if the slots
> + specified in the standby_slot_names do not exist or are invalidated.
> + </para>
>
> The wording doesn't seem right. IMO this should be worded much like how this
> GUC is described in guc_tables.c
>
> e.g something a bit like:
>
> Lists the streaming replication standby server slot names that logical WAL
> sender processes will wait for. Logical WAL sender processes will send
> decoded changes to plugins only after the specified replication slots confirm
> receiving WAL. This guarantees that logical replication slots with failover
> enabled do not consume changes until those changes are received and flushed
> to corresponding physical standbys...

Changed.

>
> ==========
> doc/src/sgml/logicaldecoding.sgml
>
> 4. Section 48.2.3 Replication Slot Synchronization
>
> + It's also highly recommended that the said physical replication slot
> + is named in
> + <link
> linkend="guc-standby-slot-names"><varname>standby_slot_names</varna
> me></link>
> + list on the primary, to prevent the subscriber from consuming changes
> + faster than the hot standby. But once we configure it, then
> certain latency
> + is expected in sending changes to logical subscribers due to wait on
> + physical replication slots in
> + <link
> +
> linkend="guc-standby-slot-names"><varname>standby_slot_names</varna
> me>
> + </link>
>
> 4a.
> /It's also highly/It is highly/
>
> ~
>
> 4b.
>
> BEFORE
> But once we configure it, then certain latency is expected in sending changes
> to logical subscribers due to wait on physical replication slots in <link
> linkend="guc-standby-slot-names"><varname>standby_slot_names</varna
> me></link>
>
> SUGGESTION
> Even when correctly configured, some latency is expected when sending
> changes to logical subscribers due to the waiting on slots named in
> standby_slot_names.

Changed.

>
> ==========
> .../replication/logical/logicalfuncs.c
>
> 5. pg_logical_slot_get_changes_guts
>
> + if (XLogRecPtrIsInvalid(upto_lsn))
> + wait_for_wal_lsn = end_of_wal;
> + else
> + wait_for_wal_lsn = Min(upto_lsn, end_of_wal);
> +
> + /*
> + * Wait for specified streaming replication standby servers (if any)
> + * to confirm receipt of WAL up to wait_for_wal_lsn.
> + */
> + WaitForStandbyConfirmation(wait_for_wal_lsn);
>
> Perhaps those statements all belong together with the comment up-front. e.g.
>
> + /*
> + * Wait for specified streaming replication standby servers (if any)
> + * to confirm receipt of WAL up to wait_for_wal_lsn.
> + */
> + if (XLogRecPtrIsInvalid(upto_lsn))
> + wait_for_wal_lsn = end_of_wal;
> + else
> + wait_for_wal_lsn = Min(upto_lsn, end_of_wal);
> + WaitForStandbyConfirmation(wait_for_wal_lsn);

Changed.

>
> ==========
> src/backend/replication/logical/slotsync.c
>
> ==========
> src/backend/replication/slot.c
>
> 6.
> +static bool
> +validate_standby_slots(char **newval)
> +{
> + char *rawname;
> + List *elemlist;
> + ListCell *lc;
> + bool ok;
> +
> + /* Need a modifiable copy of string */ rawname = pstrdup(*newval);
> +
> + /* Verify syntax and parse string into a list of identifiers */ ok =
> + SplitIdentifierString(rawname, ',', &elemlist);
> +
> + if (!ok)
> + {
> + GUC_check_errdetail("List syntax is invalid."); }
> +
> + /*
> + * If the replication slots' data have been initialized, verify if the
> + * specified slots exist and are logical slots.
> + */
> + else if (ReplicationSlotCtl)
> + {
> + foreach(lc, elemlist)
>
> 6a.
> So, if the ReplicationSlotCtl is NULL, it is possible to return ok=true without
> ever checking if the slots exist or are of the correct kind. I am wondering what
> are the ramifications of that. -- e.g.
> assuming names are OK when maybe they aren't OK at all. AFAICT this works
> because it relies on getting subsequent WARNINGS when calling
> FilterStandbySlots(). If that is correct then maybe the comment here can be
> enhanced to say so.
>
> Indeed, if it works like that, now I am wondering do we need this for loop
> validation at all. e.g. it seems just a matter of timing whether we get ERRORs
> validating the GUC here, or WARNINGS later in the FilterStandbySlots. Maybe
> we don't need the double-checking and it is enough to check in
> FilterStandbySlots?

I think the check is OK so didn’t change this.

>
> ~
>
> 6b.
> AFAIK there are alternative foreach macros available now, so you shouldn't
> need to declare the ListCell.

Changed.

>
> ~~~
>
> 7. check_standby_slot_names
>
> +bool
> +check_standby_slot_names(char **newval, void **extra, GucSource source)
> +{ if (strcmp(*newval, "") == 0) return true;
>
> Using strcmp seems like an overkill way to check for empty string.
>
> SUGGESTION
>
> if (*newval == '\0')
> return true;
>

Changed.

> ~~~
>
> 8.
> + if (strcmp(*newval, "*") == 0)
> + {
> + GUC_check_errdetail("\"%s\" is not accepted for standby_slot_names",
> + *newval); return false; }
>
> It seems overkill to use a format specifier when "*" is already the known value.
>
> SUGGESTION
> GUC_check_errdetail("Wildcard \"*\" is not accepted for
> standby_slot_names.");
>

Changed.

> ~~~
>
> 9.
> + /* Now verify if the specified slots really exist and have correct
> + type */ if (!validate_standby_slots(newval)) return false;
>
> As in a prior comment, if ReplicationSlotCtl is NULL then it is not always going
> to do exactly what that comment says it is doing...

I think the comment is OK, one can check the detail in the
function definition if needed.

>
> ~~~
>
> 10. assign_standby_slot_names
>
> + if (!SplitIdentifierString(standby_slot_names_cpy, ',',
> + &standby_slots)) {
> + /* This should not happen if GUC checked check_standby_slot_names. */
> + elog(ERROR, "invalid list syntax"); }
>
> I didn't see how it is possible to get here without having first executed
> check_standby_slot_names. But, if it can happen, then maybe describe the
> scenario in the comment.

This is sanity check which we don't expect to happen, which follows similar style of preprocessNamespacePath.

>
> ~~~
>
> 11.
> + * Note that since we do not support syncing slots to cascading
> + standbys, we
> + * return NIL if we are running in a standby to indicate that no
> + standby slots
> + * need to be waited for, regardless of the copy flag value.
>
> I didn't understand the relevance of even mentioning "regardless of the copy
> flag value".

Removed.

>
> ~~~
>
> 12. FilterStandbySlots
>
> + errhint("Consider starting standby associated with \"%s\" or amend
> standby_slot_names.",
> + name));
>
> This errhint should use a format substitution for the GUC "standby_slot_names"
> for consistency with everything else.

Changed.

>
> ~~~
>
> 13. WaitForStandbyConfirmation
>
> + /*
> + * We wait for the slots in the standby_slot_names to catch up, but we
> + * use a timeout (1s) so we can also check the if the
> + * standby_slot_names has been changed.
> + */
> + ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, 1000,
> + WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION);
>
> Typo "the if the"
>

Changed.

> ==========
> src/backend/replication/slotfuncs.c
>
> 14. pg_physical_replication_slot_advance
> +
> + PhysicalWakeupLogicalWalSnd();
>
> Should this have a comment to say what it is for?
>

Added.

> ==========
> src/backend/replication/walsender.c
>
> 15.
> +/*
> + * Wake up the logical walsender processes with failover enabled slots
> +if the
> + * currently acquired physical slot is specified in standby_slot_names GUC.
> + */
> +void
> +PhysicalWakeupLogicalWalSnd(void)
> +{
> + ListCell *lc;
> + List *standby_slots;
> +
> + Assert(MyReplicationSlot && SlotIsPhysical(MyReplicationSlot));
> +
> + standby_slots = GetStandbySlotList(false);
> +
> + foreach(lc, standby_slots)
> + {
> + char *name = lfirst(lc);
> +
> + if (strcmp(name, NameStr(MyReplicationSlot->data.name)) == 0) {
> +ConditionVariableBroadcast(&WalSndCtl->wal_confirm_rcv_cv);
> + return;
> + }
> + }
> +}
>
> 15a.
> There already exists another function called WalSndWakeup(bool physical,
> bool logical), so I think this new one should use a similar name pattern -- e.g.
> maybe like WalSndWakeupLogicalForSlotSync or ...

WalSndWakeup is a general function for both physical and logical sender, but
our new function is specific to physical sender which is more similar to
PhysicalConfirmReceivedLocation/ PhysicalReplicationSlotNewXmin, so I think the
current name is ok.

>
> ~
>
> 15b.
> IIRC there are some new List macros you can use instead of needing to declare
> the ListCell?

Changed.

>
> ==========
> .../utils/activity/wait_event_names.txt
>
> 16.
> +WAIT_FOR_STANDBY_CONFIRMATION "Waiting for the WAL to be received
> by
> physical standby."
>
> Moving the 'the' will make this more consistent with all other "Waiting for
> WAL..." names.
>
> SUGGESTION
> Waiting for WAL to be received by the physical standby.

Changed.

>
> ==========
> src/backend/utils/misc/guc_tables.c
>
> 17.
> + {
> + {"standby_slot_names", PGC_SIGHUP, REPLICATION_PRIMARY,
> + gettext_noop("Lists streaming replication standby server slot "
> + "names that logical WAL sender processes will wait for."),
> + gettext_noop("Decoded changes are sent out to plugins by logical "
> + "WAL sender processes only after specified "
> + "replication slots confirm receiving WAL."), GUC_LIST_INPUT |
> + GUC_LIST_QUOTE }, &standby_slot_names, "", check_standby_slot_names,
> + assign_standby_slot_names, NULL },
>
> The wording of the detail msg feels kind of backwards to me.
>
> BEFORE
> Decoded changes are sent out to plugins by logical WAL sender processes
> only after specified replication slots confirm receiving WAL.
>
> SUGGESTION
> Logical WAL sender processes will send decoded changes to plugins only after
> the specified replication slots confirm receiving WAL.

Changed.

>
> ==========
> src/backend/utils/misc/postgresql.conf.sample
>
> 18.
> +#standby_slot_names = '' # streaming replication standby server slot
> +names that # logical walsender processes will wait for
>
> I'm not sure this is the best GUC name. See the general comment #0 above in
> this post.

As discussed, I didn’t change this.

>
> ==========
> src/include/replication/slot.h
>
> ==========
> src/include/replication/walsender.h
>
> ==========
> src/include/replication/walsender_private.h
>
> ==========
> src/include/utils/guc_hooks.h
>
> ==========
> src/test/recovery/t/006_logical_decoding.pl
>
> 19.
> +# Pass failover=true (last-arg), it should not have any impact on advancing.
>
> SUGGESTION
> Passing failover=true (last arg) should not have any impact on advancing.

Changed.

>
> ==========
> .../t/040_standby_failover_slots_sync.pl
>
> 20.
> +#
> +# | ----> standby1 (primary_slot_name = sb1_slot) # | ----> standby2
> +(primary_slot_name = sb2_slot) # primary ----- | # | ----> subscriber1
> +(failover = true) # | ----> subscriber2 (failover = false)
>
> In the diagram, the "--->" means a mixture of physical standbys and logical
> pub/sub replication. Maybe it can be a bit clearer?
>
> SUGGESTION
>
> # primary (publisher)
> #
> # (physical standbys)
> # | ----> standby1 (primary_slot_name = sb1_slot)
> # | ----> standby2 (primary_slot_name = sb2_slot)
> #
> # (logical replication)
> # | ----> subscriber1 (failover = true, slot_name = lsub1_slot)
> # | ----> subscriber2 (failover = false, slot_name = lsub2_slot)
>

I think one can distinguish it based on the 'standby' and 'subscriber' as well, because
'standby' normally refer to physical standby while the other refer to logical.

> ~~~
>
> 21.
> +# Set up is configured in such a way that the logical slot of
> +subscriber1 is # enabled failover, thus it will wait for the physical
> +slot of # standby1(sb1_slot) to catch up before sending decoded changes to
> subscriber1.
>
> /is enabled failover/is enabled for failover/

Changed.

>
> ~~~
>
> 22.
> +# Create another subscriber node without enabling failover, wait for
> +sync to # complete my $subscriber2 =
> +PostgreSQL::Test::Cluster->new('subscriber2');
> +$subscriber2->init;
> +$subscriber2->start;
> +$subscriber2->safe_psql(
> + 'postgres', qq[
> + CREATE TABLE tab_int (a int PRIMARY KEY); CREATE SUBSCRIPTION
> +regress_mysub2 CONNECTION '$publisher_connstr'
> PUBLICATION regress_mypub WITH (slot_name = lsub2_slot);
> +]);
> +
> +$subscriber1->wait_for_subscription_sync;
> +
>
> Is this meant to wait for 'subscription2'?

Yes, fixed.

>
> ~~~
>
> 23.
> # Stop the standby associated with the specified physical replication slot so #
> that the logical replication slot won't receive changes until the standby #
> comes up.
>
> Maybe this can give the values for better understanding:
>
> SUGGESTION
> Stop the standby associated with the specified physical replication slot
> (sb1_slot) so that the logical replication slot (lsub1_slot) won't receive changes
> until the standby comes up.

Changed.

>
> ~~~
>
> 24.
> +# Wait for the standby that's up and running gets the data from primary
>
> SUGGESTION
> Wait until the standby2 that's still running gets the data from the primary.
>

Changed.

> ~~~
>
> 25.
> +# Wait for the subscription that's up and running and is not enabled
> for failover.
> +# It gets the data from primary without waiting for any standbys.
>
> SUGGESTION
> Wait for subscription2 to get the data from the primary. This subscription was
> not enabled for failover so it gets the data without waiting for any standbys.
>

Changed.

> ~~~
>
> 26.
> +# The subscription that's up and running and is enabled for failover #
> +doesn't get the data from primary and keeps waiting for the # standby
> +specified in standby_slot_names.
>
> SUGGESTION
> The subscription1 was enabled for failover so it doesn't get the data from
> primary and keeps waiting for the standby specified in standby_slot_names
> (sb1_slot aka standby1).
>

Changed.

> ~~~
>
> 27.
> +# Start the standby specified in standby_slot_names and wait for it to
> +catch # up with the primary.
>
> SUGGESTION
> Start the standby specified in standby_slot_names (sb1_slot aka
> standby1) and wait for it to catch up with the primary.
>

Changed.

> ~~~
>
> 28.
> +# Now that the standby specified in standby_slot_names is up and
> +running, # primary must send the decoded changes to subscription
> +enabled for failover # While the standby was down, this subscriber
> +didn't receive any data from # primary i.e. the primary didn't allow it to go
> ahead of standby.
>
> SUGGESTION
> Now that the standby specified in standby_slot_names is up and running, the
> primary can send the decoded changes to the subscription enabled for failover
> (i.e. subscription1). While the standby was down,
> subscription1 didn't receive any data from the primary. i.e. the primary didn't
> allow it to go ahead of standby.
>

Changed.

> ~~~
>
> 29.
> +# Stop the standby associated with the specified physical replication
> +slot so # that the logical replication slot won't receive changes until
> +the standby # slot's restart_lsn is advanced or the slot is removed
> +from the # standby_slot_names list.
> +$primary->safe_psql('postgres', "TRUNCATE tab_int;");
> +$primary->wait_for_catchup('regress_mysub1');
> +$standby1->stop;
>
> Isn't this fragment more like the first step of the *next* TEST instead of the last
> step of this one?
>

Changed.

> ~~~
>
> 30.
> +##################################################
> +# Verify that when using pg_logical_slot_get_changes to consume changes
> +from a # logical slot with failover enabled, it will also wait for the
> +slots specified # in standby_slot_names to catch up.
> +##################################################
>
> AFAICT this test is checking only that the function cannot return while waiting
> for the stopped standby, but it doesn't seem to check that it *does* return
> when the stopped standby comes alive again.
>

Will think about this.

> ~~~
>
> 31.
> +$result =
> + $subscriber1->safe_psql('postgres', "SELECT count(*) = 0 FROM
> +tab_int;"); is($result, 't',
> + "subscriber1 doesn't get data as the sb1_slot doesn't catch up");
>
> Do you think this fragment should have a comment?

Added.

Attach the V100 patch set which addressed above comments.

Best Regards,
Hou zj

Attachment Content-Type Size
v100-0002-Document-the-steps-to-check-if-the-standby-is-r.patch application/octet-stream 7.0 KB
v100-0001-Allow-logical-walsenders-to-wait-for-the-physic.patch application/octet-stream 37.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-02-28 03:19:19 Re: Synchronizing slots from primary to standby
Previous Message Andrew Dunstan 2024-02-28 01:45:26 Re: Logging parallel worker draught