Re: Synchronizing slots from primary to standby

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-03-01 06:10:57
Message-ID: CAD21AoBhty79zHgXYMNHH1KqO2VtmjRi22QPmYP2yaHC9WFVdw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 1, 2024 at 12:42 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Friday, March 1, 2024 10:17 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> >
> > Attach the V102 patch set which addressed Amit and Shveta's comments.
> > Thanks Shveta for helping addressing the comments off-list.
>
> The cfbot reported a compile warning, here is the new version patch which fixed it,
> Also removed some outdate comments in this version.
>

Thank you for updating the patch!

I've reviewed the v102-0001 patch. Here are some comments:

---
I got a compiler warning:

walsender.c:1829:6: warning: variable 'wait_event' is used
uninitialized whenever '&&' condition is false
[-Wsometimes-uninitialized]
if (!XLogRecPtrIsInvalid(RecentFlushPtr) &&
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
walsender.c:1871:7: note: uninitialized use occurs here
if (wait_event == WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL)
^~~~~~~~~~
walsender.c:1829:6: note: remove the '&&' if its condition is always true
if (!XLogRecPtrIsInvalid(RecentFlushPtr) &&
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
walsender.c:1818:20: note: initialize the variable 'wait_event' to
silence this warning
uint32 wait_event;
^
= 0
1 warning generated.

---
+void
+assign_standby_slot_names(const char *newval, void *extra)
+{
+ List *standby_slots;
+ MemoryContext oldcxt;
+ char *standby_slot_names_cpy = extra;
+

Given that the newval and extra have the same data (standby_slot_names
value), why do we not use newval instead? I think that if we use
newval, we don't need to guc_strdup() in check_standby_slot_names(),
we might need to do list_copy_deep() instead, though. It's not clear
to me as there is no comment.

---
+
+ standby_slot_oldest_flush_lsn = min_restart_lsn;
+

IIUC we expect that standby_slot_oldest_flush_lsn never moves
backward. If so, I think it's better to have an assertion here.

---
Resetting standby_slot_names doesn't work:

=# alter system set standby_slot_names to '';
ERROR: invalid value for parameter "standby_slot_names": """"
DETAIL: replication slot "" does not exist

---
+ /*
+ * Switch to the memory context under which GUC variables are allocated
+ * (GUCMemoryContext).
+ */
+ oldcxt =
MemoryContextSwitchTo(GetMemoryChunkContext(standby_slot_names_cpy));
+ standby_slot_names_list = list_copy(standby_slots);
+ MemoryContextSwitchTo(oldcxt);

Why do we not explicitly switch to GUCMemoryContext?

---
+ if (!standby_slot_names_list)
+ return true;
+

Probably "standby_slot_names_list == NIL" is more consistent with
other places. The same can be applied in WaitForStandbyConfirmation().

---
+ /*
+ * Return true if all the standbys have already caught up to
the passed in
+ * WAL localtion.
+ */
+

s/localtion/location/

---
I was a bit surprised by the fact that standby_slot_names value is
handled in a different way than a similar parameter
synchronous_standby_names. For example, the following command doesn't
work unless there is a replication slot 'slot1, slot2':

=# alter system set standby_slot_names to 'slot1, slot2';
ERROR: invalid value for parameter "standby_slot_names": ""slot1, slot2""
DETAIL: replication slot "slot1, slot2" does not exist

Whereas "alter system set synchronous_standby_names to stb1, stb2;"
can correctly separate the string into 'stb1' and 'stb2'.

Probably it would be okay since this behavior of standby_slot_names is
the same as other GUC parameters that accept a comma-separated string.
But I was confused a bit the first time I used it.

---
+ /*
+ * "*" is not accepted as in that case primary will not be able to know
+ * for which all standbys to wait for. Even if we have physical slots
+ * info, there is no way to confirm whether there is any standby
+ * configured for the known physical slots.
+ */
+ if (strcmp(*newval, "*") == 0)
+ {
+ GUC_check_errdetail("\"*\" is not accepted for
standby_slot_names");
+ return false;
+ }

Why only '*' is checked aside from validate_standby_slots()? I think
that the doc doesn't mention anything about '*' and '*' cannot be used
as a replication slot name. So even if we don't have this check, it
might be no problem.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-03-01 06:21:46 Re: Synchronizing slots from primary to standby
Previous Message Michael Paquier 2024-03-01 05:59:28 Re: DOCS: Avoid using abbreviation "aka"