| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Yilin Zhang <jiezhilove(at)126(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
| Date: | 2026-02-17 06:33:11 |
| Message-ID: | CAJpy0uAt6qqSmnx3adzKqYuqgFZ-_g=yxNJP+M3YH0ON135uTQ@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Feb 17, 2026 at 9:45 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Feb 17, 2026 at 9:13 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Mon, Feb 16, 2026 at 4:35 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Feb 13, 2026 at 7:54 AM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > > > Thanks for pushing! Here are the remaining patches.
> > > >
> > >
> > > One thing that bothers me about the remaining patch is that it could
> > > lead to infinite retries in the worst case. For example, in the first
> > > try, slot-1 is not synced say due to physical replication delays in
> > > flushing WALs up to the confirmed_flush_lsn of that slot, then in the next
> > > (re-)try, the same thing happens for slot-2, then in the next (re-)try,
> > > slot-3 appears to be invalidated on standby but it is valid on primary,
> > > and so on. What do you think?
> >
> > Yes, that is a possibility we cannot rule out. This can also happen
> > during the first invocation of the API (even without the new changes):
> > when we attempt to create new slots, they may remain in a temporary
> > state indefinitely. However, that risk is limited to the initial sync,
> > until the slots are persisted, which is somewhat expected behavior.
> >
>
> Right.
>
> > With the current changes though, the possibility of an indefinite wait
> > exists during every run. So the question becomes: what would be more
> > desirable for users -- for the API to finish with the risk that a few
> > slots are not synced, or for the API to wait longer to ensure that all
> > slots are properly synced?
> >
> > I think that if the primary use case of this API is when a user plans
> > to run it before a scheduled failover, then it would be better for the
> > API to wait and ensure everything is properly synced.
> >
>
> I don't think we can guarantee that all slots are synced as per the latest
> primary state in one invocation because some newly created slots can
> anyway be missed.
Oh, right.
> So why take the risk of infinite waits in the API? I
> think it may be better to extend the usage of this API (probably with
> more parameters) based on more user feedback.
I agree.
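
To make the trade-off above concrete, here is a minimal, self-contained sketch in C of a retry loop that gives up after a fixed number of passes instead of waiting indefinitely. It is only an illustration: DemoSlot, demo_sync_pass() and demo_sync_with_retry_cap() are hypothetical names, not part of the patch or of PostgreSQL, and the "failure" of a slot is simulated with a counter rather than real LSN checks. A retry cap is also just one possible way to bound the wait; nothing in this thread proposes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one slot's sync state; not a PostgreSQL type. */
typedef struct DemoSlot
{
    const char *name;
    int         fails_remaining;    /* simulated passes that still fail */
    bool        synced;
} DemoSlot;

/* Try each not-yet-synced slot once; return true if all are now synced. */
static bool
demo_sync_pass(DemoSlot *slots, size_t nslots)
{
    bool        all_synced = true;

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].synced)
            continue;

        if (slots[i].fails_remaining > 0)
        {
            /* Simulates e.g. the standby not yet having flushed enough WAL. */
            slots[i].fails_remaining--;
            all_synced = false;
        }
        else
            slots[i].synced = true;
    }

    return all_synced;
}

/*
 * Run at most max_attempts passes. Returns the number of passes used when
 * every slot got synced, or -1 when the cap was hit with slots still pending.
 */
static int
demo_sync_with_retry_cap(DemoSlot *slots, size_t nslots, int max_attempts)
{
    assert(slots != NULL);

    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        if (demo_sync_pass(slots, nslots))
            return attempt;
    }

    return -1;
}
```

With such a cap the call always returns, at the cost that some slots may be left unsynced, which is the same trade-off discussed above.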
> > But I am not
> > very sure of the use case, though. What do you think?
> >
> > > Independent of whether we consider the entire patch, the following bit
> > > in the patch is useful as we retry to sync the slots via API.
> > > @@ -218,7 +219,7 @@ update_local_synced_slot(RemoteSlot *remote_slot,
> > > Oid remote_dbid)
> > > * Can get here only if GUC 'synchronized_standby_slots' on the
> > > * primary server was not configured correctly.
> > > */
> > > - ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR,
> > > + ereport(LOG,
> > > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > > errmsg("skipping slot synchronization because the received slot sync"
> > > " LSN %X/%08X for slot \"%s\" is ahead of the standby position %X/%08X",
> > >
> >
> > yes. I agree.
> >
>
> Let's wait for Hou-San's opinion on this one.
>
Sure.
thanks
Shveta