Re: Synchronizing slots from primary to standby

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-02-29 02:59:19
Message-ID: CAHut+Pvqr5wTK3wpcRzSXxiAmBoRa249w0NMjzUOkJg3TTiLCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 28, 2024 at 1:23 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, February 27, 2024 3:18 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
...
> > 20.
> > +#
> > +# | ----> standby1 (primary_slot_name = sb1_slot) # | ----> standby2
> > +(primary_slot_name = sb2_slot) # primary ----- | # | ----> subscriber1
> > +(failover = true) # | ----> subscriber2 (failover = false)
> >
> > In the diagram, the "--->" means a mixture of physical standbys and logical
> > pub/sub replication. Maybe it can be a bit clearer?
> >
> > SUGGESTION
> >
> > # primary (publisher)
> > #
> > # (physical standbys)
> > # | ----> standby1 (primary_slot_name = sb1_slot)
> > # | ----> standby2 (primary_slot_name = sb2_slot)
> > #
> > # (logical replication)
> > # | ----> subscriber1 (failover = true, slot_name = lsub1_slot)
> > # | ----> subscriber2 (failover = false, slot_name = lsub2_slot)
> >
>
> I think one can distinguish it based on the 'standby' and 'subscriber' as well, because
> 'standby' normally refer to physical standby while the other refer to logical.
>

Ok, but shouldn't it at least include info about the logical slot
names associated with the subscribers (slot_name = lsub1_slot,
slot_name = lsub2_slot) like suggested above?

======

Here are some more review comments for v100-0001

======
doc/src/sgml/config.sgml

1.
+ <para>
+ Lists the streaming replication standby server slot names that logical
+ WAL sender processes will wait for. Logical WAL sender processes will
+ send decoded changes to plugins only after the specified replication
+ slots confirm receiving WAL. This guarantees that logical replication
+ slots with failover enabled do not consume changes until those changes
+ are received and flushed to corresponding physical standbys. If a
+ logical replication connection is meant to switch to a physical standby
+ after the standby is promoted, the physical replication slot for the
+ standby should be listed here. Note that logical replication will not
+ proceed if the slots specified in the standby_slot_names do
not exist or
+ are invalidated.
+ </para>

Is that note ("Note that logical replication will not proceed if the
slots specified in the standby_slot_names do not exist or are
invalidated") meant only for subscriptions marked for 'failover' or
any subscription? Maybe wording can be modified to help clarify it?

======
src/backend/replication/slot.c

2.
+/*
+ * A helper function to validate slots specified in GUC standby_slot_names.
+ */
+static bool
+validate_standby_slots(char **newval)
+{
+ char *rawname;
+ List *elemlist;
+ bool ok;
+
+ /* Need a modifiable copy of string */
+ rawname = pstrdup(*newval);
+
+ /* Verify syntax and parse string into a list of identifiers */
+ ok = SplitIdentifierString(rawname, ',', &elemlist);
+
+ if (!ok)
+ {
+ GUC_check_errdetail("List syntax is invalid.");
+ }
+
+ /*
+ * If the replication slots' data have been initialized, verify if the
+ * specified slots exist and are logical slots.
+ */
+ else if (ReplicationSlotCtl)
+ {
+ foreach_ptr(char, name, elemlist)
+ {
+ ReplicationSlot *slot;
+
+ slot = SearchNamedReplicationSlot(name, true);
+
+ if (!slot)
+ {
+ GUC_check_errdetail("replication slot \"%s\" does not exist",
+ name);
+ ok = false;
+ break;
+ }
+
+ if (!SlotIsPhysical(slot))
+ {
+ GUC_check_errdetail("\"%s\" is not a physical replication slot",
+ name);
+ ok = false;
+ break;
+ }
+ }
+ }
+
+ pfree(rawname);
+ list_free(elemlist);
+ return ok;
+}

2a.
I didn't mention this previously because I thought this function was
not going to change anymore, but since Bertrand suggested some changes
[1], I will say IMO the { } are fine here for the single statement,
but I think it will be better to rearrange this code to be like below.
Having a 2nd NOP 'else' gives a much better place where you can put
your ReplicationSlotCtl comment.

if (!ok)
{
GUC_check_errdetail("List syntax is invalid.");
}
else if (!ReplicationSlotCtl)
{
<Insert big comment here about why it is OK to skip when
ReplicationSlotCtl is NULL>
}
else
{
foreach_ptr ...
}

~

2b.
In any case even if don't refactor anything I still think you need to
extend that comment to explain how skipping ReplicationSlotCtl is NULL
is only OK because there will be some later checking also done in the
FilterStandbySlots() function to catch any config problems.

------
[1] https://www.postgresql.org/message-id/Zd8ahZXw82ieFxX/%40ip-10-97-1-34.eu-west-3.compute.internal

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2024-02-29 03:01:47 Re: Improve readability by using designated initializers when possible
Previous Message Masahiko Sawada 2024-02-29 02:47:51 Re: Making the initial and maximum DSA segment sizes configurable