From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | Ajin Cherian <itsajin(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
Date: | 2025-08-04 06:00:49 |
Message-ID: | CAA4eK1JHnqdOaQ-Y4ALJQgeLyoJvg1XPvgL3hss+WOpwChbJmw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Aug 1, 2025 at 2:50 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> 5)
> I tried a test where there were 4 slots on the publisher, where one
> was getting used while the others were not. Initiated
> pg_sync_replication_slots on standby. Forced unused slots to be
> invalidated by setting idle_replication_slot_timeout=60 on primary,
> due to which API finished but gave a warning:
>
> postgres=# SELECT pg_sync_replication_slots();
> WARNING: aborting initial sync for slot "failover_slot"
> DETAIL: This slot was invalidated on the primary server.
> WARNING: aborting initial sync for slot "failover_slot2"
> DETAIL: This slot was invalidated on the primary server.
> WARNING: aborting initial sync for slot "failover_slot3"
> DETAIL: This slot was invalidated on the primary server.
> pg_sync_replication_slots
> ---------------------------
>
> (1 row)
>
> Do we need these warnings here? I think we can have it as a LOG rather
> than having it on console. Thoughts?
>
What is the behaviour of a slotsync worker in the same case? I don't
see any such WARNING messages in the code of slotsync worker, so why
do we want a different behaviour here?
Few other comments:
======================
1.
update_and_persist_local_synced_slot()
{
...
+ /*
+ * For SQL API synchronization, we wait for the remote slot to catch up
+ * here, since we can't assume the SQL API will be called again soon.
+ * We will retry the sync once the slot catches up.
+ *
+ * Note: This will return false if a promotion is triggered on the
+ * standby while waiting, in which case we stop syncing and drop the
+ * temporary slot.
+ */
+ if (!wait_for_primary_slot_catchup(wrconn, remote_slot))
+ return false;
Why is the wait added at this level? Shouldn't it be at API level aka
in SyncReplicationSlots() or pg_sync_replication_slots() similar to
what we do in ReplSlotSyncWorkerMain() for slotsync workers?
2.
REPLICATION_SLOTSYNC_MAIN "Waiting in main loop of slot sync worker."
...
+REPLICATION_SLOTSYNC_PRIMARY_CATCHUP "Waiting for the primary to catch-up."
Can't we reuse existing waitevent REPLICATION_SLOTSYNC_MAIN? We may
want to change the description. Is there a specific reason to add a
new wait_event for this API?
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Yugo Nagata | 2025-08-04 06:14:18 | Doc: Add note for running ANALYZE after ALTER TABLE ... SET EXPRESSION |
Previous Message | Daniil Davydov | 2025-08-04 05:59:40 | Re: Making WAL archiving faster — multi-file support and async ideas |