RE: Improve pg_sync_replication_slots() to wait for primary to advance

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Yilin Zhang <jiezhilove(at)126(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Improve pg_sync_replication_slots() to wait for primary to advance
Date: 2026-02-11 08:44:33
Message-ID: TY4PR01MB16907DD16098BE3B20486D4569463A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Tuesday, February 10, 2026 5:34 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> Thanks for the patch.
>
> + * Note that we do not wait and retry if the local slot has been invalidated.
> + * In such cases, the corresponding remote slot on the primary is likely
> + * invalidated as well. Even if only the local slot is invalidated, simply
> + * retrying synchronization won't suffice, as it requires further user actions
> + * to verify the server configuration, drop the invalidated slot.
>
> On thinking more, I realized that if the local slot is invalidated alone while
> the remote-slot is not, we do not wait for the user to drop such an invalidated
> slot. Instead slot-sync will drop it internally. See comments atop
> drop_local_obsolete_slots(). This makes me wonder whether such a case, where
> only the local slot is invalidated, should also set slotsync_pending = true,
> since there is a good chance it will get synchronized in subsequent runs.
> OTOH, if we do not wait for such a slot, we could end up in a situation where
> the slot (remote one) is valid pre-failover but is invalid (synced one)
> post-failover, even after running the API immediately before switchover.
> Thoughts?

I agree that it makes sense to retry when only the local slot is invalidated.
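
To make the agreed behavior concrete, here is a minimal standalone sketch of the
decision discussed above. It is an illustration only, not code from the attached
patches: the SlotState struct and should_retry_invalidated_slot() are
hypothetical names, and the sketch assumes that in the real code a "true" result
would correspond to setting slotsync_pending = true for the slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one slot pair; not a PostgreSQL struct. */
typedef struct SlotState
{
	bool		local_invalidated;	/* synced slot on the standby is invalidated */
	bool		remote_invalidated; /* slot on the primary is invalidated */
} SlotState;

/*
 * Decide whether a later sync cycle should retry a slot whose local copy
 * is invalidated.
 *
 * Assumption taken from the discussion: when only the local slot is
 * invalidated, slot-sync drops it internally and a subsequent run can
 * recreate it, so retrying is worthwhile. When the remote slot is
 * invalidated too, retrying cannot help.
 */
static bool
should_retry_invalidated_slot(const SlotState *slot)
{
	assert(slot != NULL);

	if (!slot->local_invalidated)
		return false;			/* not the case considered here */

	return !slot->remote_invalidated;
}
```

With this model, a slot pair where only the local side is invalidated is the one
case that reports "retry"; every other combination does not.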

Here is the updated patch.

Best Regards,
Hou zj

Attachment                                                        Content-Type              Size
v5-0004-Add-a-taptest.patch                                       application/octet-stream  3.3 KB
v5-0001-Refactoring-remove-some-unnecessary-func-paramete.patch   application/octet-stream  6.7 KB
v5-0002-Refactoring-move-similar-checks-to-a-central-plac.patch   application/octet-stream  6.2 KB
v5-0003-Improve-the-retry-logic-in-pg_sync_replication_sl.patch   application/octet-stream  9.9 KB
