Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date: 2025-10-09 09:26:28
Message-ID: CAExHW5uz2Z2wBg=F7U4VV24URLYu3AA+-R6eH7Pkj5Sr9qi_wA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 9, 2025 at 2:42 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Oct 7, 2025 at 4:49 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Tue, Oct 7, 2025 at 3:47 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Oct 7, 2025 at 3:24 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > > >
> > > > Hello Hackers,
> > > >
> > > > In an offline discussion, I was considering adding a TAP test for this
> > > > patch. However, testing the pg_sync_replication_slots() API’s wait
> > > > logic requires a delay of at least 2 seconds, since that’s the
> > > > interval the API sleeps before retrying. I’m not sure it’s acceptable
> > > > to add a TAP test that increases runtime by 2 seconds.
> > > > I’m also wondering if 2 seconds is too long for the API to wait?
> > > > Should we reduce it to something like 200 ms instead? I’d appreciate
> > > > your feedback.
> > > >
> > >
> > > I feel a shorter nap will be good since it is an API and should finish
> > > fast. But too short a nap may result in too many primary pings
> > > especially when primary-slots are not advancing. But that case should
> > > be a rare one. Shall we have a nap of say 500ms? It is neither too
> > > short nor too long. Thoughts?
> >
> > Shorter nap times mean higher possibility of wasted CPU cycles - that
> > should be avoided.
> >
>
> This seems to be exactly opposite of what you argued previously in email [1].
>
> > Doing that for a test's sake seems wrong.
> >
>
> Yeah, if test writing is important to cover this case then we can even
> consider using an injection point.
>

I made that observation to argue that the logic deciding the naptime
in the function and in the worker should be separate. The naptime in
the function can be significantly smaller than the naptime in the
worker. But making it shorter just for the test's sake isn't a good
idea. If we can use injection points instead, so much the better.

> > Is there a way that the naptime can be controlled by external factors
> > such as the likelihood of an advanced slot?
> >
>
> We already do this for the worker where the naptime is increased
> gradually when there is no activity on the primary. It is better to
> use the same strategy here. This API is not going to be used
> frequently; rather I would say, one would like to use it just before
> planned switchover. So, I feel it is okay even if the wait time is
> slightly higher when actually required. This would prevent adding
> additional code maintenance for API and worker.

That makes sense.
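
For readers following along, the strategy described above (start with a
short nap, lengthen it while the primary shows no activity, and reset it
once a slot is updated) can be sketched roughly as below. This is only an
illustration, not the worker's actual code: the function name
next_naptime_ms(), the two bounds, and the doubling step are all
assumptions picked for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bounds, chosen only for illustration. */
#define MIN_NAPTIME_MS 200L
#define MAX_NAPTIME_MS 30000L

/*
 * Hypothetical helper: given the current nap time and whether the last
 * sync cycle updated any slot, return the nap time to use next.
 */
long
next_naptime_ms(long current_ms, bool slot_updated)
{
    /* Caller is expected to stay within the configured bounds. */
    assert(current_ms >= MIN_NAPTIME_MS && current_ms <= MAX_NAPTIME_MS);

    /* Activity seen on the primary: go back to the shortest nap. */
    if (slot_updated)
        return MIN_NAPTIME_MS;

    /* No activity: double the nap, capped at the maximum. */
    if (current_ms > MAX_NAPTIME_MS / 2)
        return MAX_NAPTIME_MS;
    return current_ms * 2;
}
```

With something of this shape shared between the function and the worker,
a call that finds the primary already advanced returns after the shortest
nap, while a call that really has to wait backs off instead of pinging
the primary at a fixed short interval.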

--
Best Wishes,
Ashutosh Bapat
