Re: Synchronizing slots from primary to standby

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-03-06 02:05:57
Message-ID: CAD21AoDqEEu=ELFk1+hOR86PpKAahTia=EBPHE7E4sAu2ORQ4A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 5, 2024 at 4:21 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, March 5, 2024 2:35 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Tue, Mar 5, 2024 at 9:15 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Mar 5, 2024 at 6:10 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > ======
> > > > src/backend/replication/walsender.c
> > > >
> > > > 5. NeedToWaitForWal
> > > >
> > > > + /*
> > > > + * Check if the standby slots have caught up to the flushed position. It
> > > > + * is good to wait up to the flushed position and then let the WalSender
> > > > + * send the changes to logical subscribers one by one which are already
> > > > + * covered by the flushed position without needing to wait on every change
> > > > + * for standby confirmation.
> > > > + */
> > > > + if (NeedToWaitForStandbys(flushed_lsn, wait_event)) return true;
> > > > +
> > > > + *wait_event = 0;
> > > > + return false;
> > > > +}
> > > > +
> > > >
> > > > 5a.
> > > > The comment (or part of it?) seems misplaced because it is talking
> > > > about WalSender sending changes, but that is not happening in this function.
> > > >
> > >
> > > I don't think so. This is invoked only by walsender and a static
> > > function. I don't see any other better place to mention this.
> > >
> > > > Also, isn't what this is saying already described by the other
> > > > comment in the caller? e.g.:
> > > >
> > >
> > > Oh no, here we are explaining the wait order.
> >
> > I think there is scope for improvement here. The comment inside
> > NeedToWaitForWal() which states that we need to wait here for standbys on
> > the flush position (and not on each change) should be outside of this function. It is
> > too embedded. And the comment which states the order of wait (first flush and
> > then standbys' confirmation) should be outside the for-loop in
> > WalSndWaitForWal(), but yes, we do need both comments. Attached is a
> > patch (.txt) with comment improvements; please merge if appropriate.
>
> Thanks, I have slightly modified the top-up patch and merged it.
>
> Attached is the V106 patch, which addresses the above and Peter's comments[1].
>

I have one question about PhysicalWakeupLogicalWalSnd():

+/*
+ * Wake up the logical walsender processes with logical failover slots if the
+ * currently acquired physical slot is specified in standby_slot_names GUC.
+ */
+void
+PhysicalWakeupLogicalWalSnd(void)
+{
+ List *standby_slots;
+
+ Assert(MyReplicationSlot && SlotIsPhysical(MyReplicationSlot));
+
+ standby_slots = GetStandbySlotList();
+
+ foreach_ptr(char, name, standby_slots)
+ {
+ if (strcmp(name, NameStr(MyReplicationSlot->data.name)) == 0)
+ {
+ ConditionVariableBroadcast(&WalSndCtl->wal_confirm_rcv_cv);
+ return;
+ }
+ }
+}

IIUC, the walsender calls this function every time after updating the
slot's restart_lsn, which could be very frequent. I'm concerned that
doing a linear search on the standby_slot_names list every time could
be expensive. Is it possible to cache this information locally in the
walsender somehow? For example, the walsender could set a flag in WalSnd
after processing the config file if its slot name is present in
standby_slot_names. That way, it can wake up logical walsenders, if
eligible, after updating the slot's restart_lsn, without checking the
standby_slot_names value.
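
To illustrate the idea, here is a minimal standalone sketch. It is not code
from the patch: WalSndSketch, RecomputeWakeFlag(), WakeLogicalIfEligible()
and broadcast_count are hypothetical names, the slot list is passed in as a
plain array instead of coming from GetStandbySlotList(), and a counter
stands in for ConditionVariableBroadcast(). It only shows the list search
moving out of the hot path and into the config-reload path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-walsender state (WalSnd in the patch). */
typedef struct WalSndSketch
{
    const char *slot_name;      /* name of the acquired physical slot */
    bool        wake_logical;   /* cached: is slot_name in standby_slot_names? */
} WalSndSketch;

/* Counts simulated broadcasts; stands in for the real condition variable. */
static int broadcast_count = 0;

/*
 * Recompute the cached flag. Assumed to run only after the config file has
 * been (re)processed, so the linear search happens rarely.
 */
static void
RecomputeWakeFlag(WalSndSketch *ws, const char *const *standby_slots,
                  size_t nslots)
{
    assert(ws != NULL && ws->slot_name != NULL);

    ws->wake_logical = false;
    for (size_t i = 0; i < nslots; i++)
    {
        if (strcmp(standby_slots[i], ws->slot_name) == 0)
        {
            ws->wake_logical = true;
            break;
        }
    }
}

/*
 * Hot path: assumed to run after every restart_lsn update. It checks only
 * the cached flag and never walks the slot list.
 */
static void
WakeLogicalIfEligible(const WalSndSketch *ws)
{
    if (ws->wake_logical)
        broadcast_count++;      /* placeholder for the real broadcast */
}
```

With this shape, a walsender whose slot is listed pays one boolean check per
restart_lsn update, and the flag has to be recomputed whenever
standby_slot_names changes, otherwise it goes stale.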

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
