Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date: 2025-12-03 03:21:35
Message-ID: CAFPTHDY19pyri1b9cectWH6NNRjWOcPa8YczqajbwXzJtGi_oQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 2, 2025 at 8:35 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Dec 2, 2025 at 1:58 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> >
> > Attached patch v27 addresses the above comments.
> >
>
> Thanks for the patch. Please find a few comments:
>
> 1)
> + /* The worker pid must not be already assigned in SlotSyncCtx */
> + Assert(SlotSyncCtx->pid == InvalidPid);
> +
>
> We can mention just 'pid' here instead of 'worker pid'
>

Changed.

> 2)
> + /*
> + * The syscache access in fetch_or_refresh_remote_slots() needs a
> + * transaction env.
> + */
> fetch_or_refresh_remote_slots --> fetch_remote_slots
>
> 3)
> SyncReplicationSlots(WalReceiverConn *wrconn)
> {
> +
>
> We can get rid of this blank line at the start of the function.
>

Fixed.

> 4)
> /*
> * Shut down the slot sync worker.
> *
> - * This function sends signal to shutdown slot sync worker, if required. It
> + * This function sends signal to shutdown the slot sync process, if
> required. It
> * also waits till the slot sync worker has exited or
> * pg_sync_replication_slots() has finished.
> */
> Shall we change comment to something like (rephrase if required):
>
> Shut down the slot synchronization.
> This function wakes up the slot sync process (either worker or backend
> running SQL function) and sets stopSignaled=true
> so that worker can exit or SQL function pg_sync_replication_slots()
> can finish. It also waits till the slot sync worker has exited or
> pg_sync_replication_slots() has finished.
>

Changed.

> 5)
> We should change the comment atop 'SlotSyncCtxStruct' as well to
> mention that this pid is either the slot sync worker's pid or
> backend's pid running the SQL function. It is needed by the startup
> process to wake these up, so that they can stop synchronization on
> seeing stopSignaled. <please rephrase as needed>
>

Changed.

> 6)
> + ereport(LOG,
> + errmsg("replication slot synchronization worker is shutting down on
> receiving SIGUSR1"));
>
> SIGUSR1 was actually just a wake-up signal. We may change the comment to:
> replication slot synchronization worker is shutting down as promotion
> is triggered.
>

Changed.

> 7)
> update_synced_slots_inactive_since:
> /* The slot sync worker or SQL function mustn't be running by now */
> - Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
> + Assert(!SlotSyncCtx->syncing);
>
> Regarding this, I see that 'update_synced_slots_inactive_since' is
> only called when we are sure that 'syncing' is false. So shouldn't pid
> be also Invalid by that time? Even if it was backend's pid to start
> with, but since backend has stopped syncing (finished or error-ed
> out),
> pid should be reset to Invalid in such a case. And this Assert need
> not to be changed.
>

Fixed.

> 8)
>
> + if (sync_process_pid != InvalidPid)
> + kill(sync_process_pid, SIGUSR1);
>
> We can write comments to say wake-up slot sync process.
>

Added comments.

Attaching patch v28 addressing these comments.

regards,
Ajin Cherian
Fujitsu Australia

Attachment Content-Type Size
v28-0001-Improve-initial-slot-synchronization-in-pg_sync_.patch application/octet-stream 30.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2025-12-03 03:26:01 Re: Skipping schema changes in publication
Previous Message Peter Geoghegan 2025-12-03 03:08:26 Re: Returning nbtree posting list TIDs in DESC order during backwards scans