Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date: 2025-10-07 11:43:17
Message-ID: CAJpy0uA0nQFbAic9TGZ28TqpQ6BoW--Ot2PnuoPjQoowaLLybQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 7, 2025 at 4:49 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Tue, Oct 7, 2025 at 3:47 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Tue, Oct 7, 2025 at 3:24 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > >
> > > Hello Hackers,
> > >
> > > In an offline discussion, I was considering adding a TAP test for this
> > > patch. However, testing the pg_sync_replication_slots() API’s wait
> > > logic requires a delay of at least 2 seconds, since that’s the
> > > interval the API sleeps before retrying. I’m not sure it’s acceptable
> > > to add a TAP test that increases runtime by 2 seconds.
> > > I’m also wondering if 2 seconds is too long for the API to wait?
> > > Should we reduce it to something like 200 ms instead? I’d appreciate
> > > your feedback.
> > >
> >
> > I feel a shorter nap will be good since it is an API and should finish
> > fast. But too short a nap may result in too many primary pings
> > especially when the primary's slots are not advancing. But that case should
> > be a rare one. Shall we have a nap of say 500ms? It is neither too
> > short nor too long. Thoughts?
>
> Shorter nap times mean higher possibility of wasted CPU cycles - that
> should be avoided. Doing that for a test's sake seems wrong. Is there
> a way that the naptime can be controlled by external factors such as
> likelihood of an advanced slot (just firing bullets in the dark) or is
> the naptime controllable by user interface like GUC? The test can use
> those interfaces.
>

Yes, we can control the naptime based on whether any slots are being
advanced on the primary. This is what the slotsync worker does: it
keeps doubling the naptime, starting from 200ms up to a maximum of
30 sec, as long as there is no activity on the primary. As soon as
activity happens, the naptime is reset to 200ms.
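
To make the backoff rule concrete, here is a rough standalone sketch in
Python. It is only an illustration of the doubling-and-reset behaviour
described above, not the slotsync worker code; the names
MIN_NAPTIME_MS, MAX_NAPTIME_MS, next_naptime_ms and naptime_schedule
are made up for this example, and only the 200ms and 30 sec values come
from the description above.

```python
# Hypothetical constants; the values are the ones quoted in this mail.
MIN_NAPTIME_MS = 200      # starting (and reset) nap
MAX_NAPTIME_MS = 30_000   # 30 sec upper bound


def next_naptime_ms(current_ms: int, slot_activity: bool) -> int:
    """Return the nap to use after one sync cycle.

    slot_activity is True if any slot advanced on the primary during
    the cycle that just finished.
    """
    if slot_activity:
        # Activity seen: go back to the shortest nap.
        return MIN_NAPTIME_MS
    # No activity: double the nap, but never exceed the maximum.
    return min(current_ms * 2, MAX_NAPTIME_MS)


def naptime_schedule(activity_per_cycle):
    """Return the nap chosen after each cycle, starting from the minimum."""
    nap = MIN_NAPTIME_MS
    schedule = []
    for active in activity_per_cycle:
        nap = next_naptime_ms(nap, active)
        schedule.append(nap)
    return schedule
```

With this rule, three idle cycles followed by one active cycle give
naps of 400, 800, 1600 and then 200 ms, and the 30 sec cap is reached
after eight consecutive idle cycles.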

thanks
Shveta
