Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date: 2025-09-03 06:28:02
Message-ID: CAFPTHDY1dRDfKX5mHfcm1KCkCTAiFti1VjYp8_6=Sxctq0+Etg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 29, 2025 at 3:01 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Aug 26, 2025 at 9:58 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> Thank You for the patches. Please find a few comments.
>
> 1)
> Change not needed:
>
> * without waiting for SLOTSYNC_RESTART_INTERVAL_SEC.
> + *
> */
>

Removed.

> 2)
> Regarding the naptime in API, I was thinking why not to use
> wait_for_slot_activity() directly? If there are no slots activity, it
> will keep on doubling the naptime starting from 2sec till max of
> 30sec. Thoughts?
>
> We can rename MIN_SLOTSYNC_WORKER_NAPTIME_MS and
> MAX_SLOTSYNC_WORKER_NAPTIME_MS to MIN_SLOTSYNC_NAPTIME_MS and
> MAX_SLOTSYNC_NAPTIME_MS in such a case.
>

Not changing this since further discussion clarified this.

> 3)
> + * target_slot_list - List of RemoteSlot structures to refresh, or NIL to
> + * fetch all failover slots
>
> Can we please change it to:
>
> List of failover logical slots to fetch from primary, or NIL to fetch
> all failover logical slots
>

Changed the variable itself.

> 4)
> In the worker, before each call to synchronize_slots(), we are
> starting a new transaction. It aligns with the previous implementation
> where StartTransaction was inside synchronize_slots(). But in API, we
> are doing StartTransaction once outside of the loop instead of doing
> before each synchronize_slots(), is it intentional? It may keep the
> transaction open for a long duration for the case where slots are not
> getting persisted soon.
>

I've added a new memory context to handle slot_names

> 5)
> With ProcessSlotSyncInterrupts() being removed from API, can you
> please check the behaviour of API on smart-shutdown and rest of the
> shutdown modes? It should behave like other APIs. And what happens if
> I change primary_conninfo to some non-existing server when the API is
> running. Does it error out or keep retrying?
>

I've tested with different types of shutdown and it seems to be
handled corerctly. However, yes, if configuration changed, the API
does not handle. I've written a new function
slotsync_api_reread_config() to specifically handle configuration
changes in API context as it is different from slotsync worker logic.

On Fri, Aug 29, 2025 at 6:50 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Fri, Aug 29, 2025 at 11:42 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > On Fri, Aug 29, 2025 at 3:01 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Aug 26, 2025 at 9:58 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > > 4)
> > > In the worker, before each call to synchronize_slots(), we are
> > > starting a new transaction. It aligns with the previous implementation
> > > where StartTransaction was inside synchronize_slots(). But in API, we
> > > are doing StartTransaction once outside of the loop instead of doing
> > > before each synchronize_slots(), is it intentional? It may keep the
> > > transaction open for a long duration for the case where slots are not
> > > getting persisted soon.
> > >
> >
> > I’ll address your other comments separately, but I wanted to respond
> > to this one first. I did try the approach you suggested, but the issue
> > is that we use the remote_slots list across loop iterations. If we end
> > the transaction at the end of each iteration, the list gets freed and
> > is no longer available for the next pass. Each iteration relies on the
> > remote_slots list from the previous one to build the new list, which
> > is why we can’t free it inside the loop.
>
> Isn't that just a matter of allocating the list in appropriate long
> lived memory context?
>

Yes, changed this.

> Here are some more comments
> <para>
> - The logical replication slots on the primary can be synchronized to
> - the hot standby by using the <literal>failover</literal> parameter of
> + The logical replication slots on the primary can be enabled for
> + synchronization to the hot standby by using the
> + <literal>failover</literal> parameter of
>
> This change corresponds to an existing feature. Should be a separate
> patch, which we may want to backport.
>

Removed.

> - on the standby, the failover slots can be synchronized periodically in
> + <command>CREATE SUBSCRIPTION</command> during slot creation. After that,
> + synchronization can be be performed either manually by calling
> + <link linkend="pg-sync-replication-slots">
> ... snip ...
> - <note>
> - <para>
> - While enabling <link linkend="guc-sync-replication-slots">
> - <varname>sync_replication_slots</varname></link> allows for automatic
> - periodic synchronization of failover slots, they can also be manually
> ... snip ...
>
> I like the current documentation which separates the discussion of two
> methods. I think we should just improve the second paragraph instead
> of deleting it and changing the first one.
>
> * without waiting for SLOTSYNC_RESTART_INTERVAL_SEC.
> + *
>
> unnecessary blank line
>

Changed.

> +/*
> + * Flag used by pg_sync_replication_slots()
> + * to do retries if the slot did not persist while syncing.
> + */
> +static bool slot_persistence_pending = false;
>
> I don't think we need to keep a global variable for this. The variable
> is used only inside SyncReplicationSlots() and the call depth is not
> more than a few calls. From synchronize_slots(), before which the
> variable is reset and after which the variable is checked, to
> update_and_persist_local_synced_slot() which sets the variable, all
> the functions return bool. All of them can be made to return an
> integer status instead indicating the result of the operation. If we
> do so we could check the return value of synchronize_slots() to decide
> whether to retry or not, isntead of maintaining a global variable
> which has a much wider scope than required. It's difficult to keep it
> updated over the time.
>

The problem is that all those calls synchronize_slots() and
update_and_persist_local_synced_slot() are shared with the slotsync
worker logic and API. Hence, changing this will affect slotsync_worker
logic as well. While the API needs to spefically retry only if the
initial sync fails, the slotsync worker will always be retrying. I
feel using a global variable is a more convenient way of doing this.

> + * Parameters:
> + * wrconn - Connection to the primary server
> + * target_slot_list - List of RemoteSlot structures to refresh, or NIL to
> + * fetch all failover slots
> *
> - * Returns TRUE if any of the slots gets updated in this sync-cycle.
>
> Need to describe the return value.
>

Added.

> */
> -static bool
> -synchronize_slots(WalReceiverConn *wrconn)
> +static List *
> +fetch_remote_slots(WalReceiverConn *wrconn, List *target_slot_list)
>
> I like the way this function is broken down into two. That breaking down is
> useful even without this feature.
>
> - /* Construct the remote_slot tuple and synchronize each slot locally */
> + /* Process the slot information */
>
> Probably these comments don't make much sense or they repeat what's
> already there in the function prologue.
>
> else
> - /* Create list of remote slots */
> + /* Add to updated list */
>
> Probably these comments don't make much sense or they repeat what's
> already there in the function prologue.
>

Removed these comments.

> @@ -1276,7 +1331,7 @@ wait_for_slot_activity(bool some_slot_updated)
>
> The function is too cute to be useful. The code should be part of
> ReplSlotSyncWorkerMain() just like other worker's main functions.
>

But this wouldn't be part of this feature.

> void
> SyncReplicationSlots(WalReceiverConn *wrconn)
> {
> PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, PointerGetDatum(wrconn));
> {
>
> Shouldn't this function call CheckForInterrupts() somewhere in the
> loop since it could be potentially an infinite loop?

I've tested this and I see that interrupts are being handled by
sending SIGQUIT and SIGINT to the backend process.

Attaching v10 with the above changes.

regards,
Ajin Cerian
Fujitsu Australia

Attachment Content-Type Size
v10-0001-Improve-initial-slot-synchronization-in-pg_sync_.patch application/octet-stream 20.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2025-09-03 06:32:41 Re: Orphan page in _bt_split
Previous Message Konstantin Knizhnik 2025-09-03 06:25:33 Re: Orphan page in _bt_split