Re: How can end users know the cause of LR slot sync delays?

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: How can end users know the cause of LR slot sync delays?
Date: 2025-09-18 07:46:47
Message-ID: CAE9k0Pn2z2oMX1LHgAyUUfmSoS=GkcpEeQZJm4UYMSNz+8_04g@mail.gmail.com
Lists: pgsql-hackers

Hi Amit,

On Thu, Sep 18, 2025 at 11:31 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Sep 17, 2025 at 8:19 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> >
> > On Wed, Sep 17, 2025 at 5:14 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Sep 17, 2025 at 4:24 PM Hayato Kuroda (Fujitsu)
> > > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > > >
> > > > Dear Shlok,
> > > >
> > > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot
> > > > indicate the current status of synchronization, it just shows the history.
> > > > I feel approach2 has more information than approach1.
> > > >
> > >
> > > I also think so but Ashutosh thought that it would be hacky. Ashutosh,
> > > did you have an opinion on this matter after seeing the patches?
> > >
> >
> > Yes, I’ve looked into both the patches. Approach 1 seems quite
> > straightforward. In approach 2, we need to pass some additional
> > arguments to update_local_sync_slot and
> > update_and_persist_local_synced_slot, which makes it feel a little
> > less clean compared to approach 1, where we simply add a new function
> > and call it directly.
> >
>
> This is because the approach-1 doesn't show the latest value of
> sync_status. I mean in the latest cycle if the sync is successful, it
> won't update the stats which I am not sure is correct because users
> may want to know the recent status of sync cycle. Otherwise, the patch
> should be almost the same.

This should be manageable, no? If we add an additional call to the
stats report function immediately after ReplicationSlotPersist(),
wouldn’t that address the issue? Please correct me if I’m overlooking
something.

@@ -600,6 +600,8 @@ update_and_persist_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)

     ReplicationSlotPersist();

+    pgstat_report_replslot_sync_skip(slot, SLOT_SYNC_SKIP_NONE);
+
     ereport(LOG,
             errmsg("newly created replication slot \"%s\" is sync-ready now",
                    remote_slot->name));
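
To show the intended effect in isolation, here is a minimal standalone sketch of "reset the reported skip reason once the slot is persisted". It is not the actual pgstat code: the struct, the helper report_sync_skip(), and SLOT_SYNC_SKIP_REMOTE_BEHIND are hypothetical stand-ins, and only SLOT_SYNC_SKIP_NONE is taken from the snippet above.

```c
/* Hypothetical skip reasons; only SLOT_SYNC_SKIP_NONE comes from the patch. */
typedef enum SlotSyncSkipReason
{
    SLOT_SYNC_SKIP_NONE,            /* last sync cycle succeeded */
    SLOT_SYNC_SKIP_REMOTE_BEHIND    /* hypothetical example reason */
} SlotSyncSkipReason;

/* Hypothetical per-slot stats entry, standing in for the real one. */
typedef struct SlotSyncStats
{
    long                skip_count;        /* number of skipped cycles */
    SlotSyncSkipReason  last_skip_reason;  /* reason seen in latest cycle */
} SlotSyncStats;

/*
 * Hypothetical stand-in for the stats report function: count only real
 * skips, but always record the latest reason so that a successful cycle
 * overwrites a stale one.
 */
static void
report_sync_skip(SlotSyncStats *stats, SlotSyncSkipReason reason)
{
    if (reason != SLOT_SYNC_SKIP_NONE)
        stats->skip_count++;
    stats->last_skip_reason = reason;
}
```

With this shape, a call with SLOT_SYNC_SKIP_NONE right after the slot is persisted leaves the skip count untouched while clearing the reason shown to users.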

In addition, does anyone really need to query the skip reason once
pg_replication_slots already shows that the slot is synced and not
temporary? Ideally, users should check the slot status in
pg_replication_slots first, and if it indicates the slot is persisted,
there seems to be little value in querying pg_stat_replication_slots
for the skip reason. That said, it is important that the information
in both views remains consistent.

> I think we can even try to write a patch
> for approach-2 without an additional out parameter in some of the
> functions.

We can aim for this, if possible.

--
With Regards,
Ashutosh Sharma.
