| From: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
|---|---|
| To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion? |
| Date: | 2026-03-30 04:18:19 |
| Message-ID: | CABdArM5rrhSmFvVL4C5LL0iea-R0HRtB=ZvD=ereoTDa1Tm=NA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Mar 27, 2026 at 10:49 PM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>
> On Fri, Mar 27, 2026 at 9:38 PM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
> > Attached the updated patch.
>
> Thanks for updating the patch! It looks good overall.
>
Thank you Fujii-san for the review.
> Regarding the comments in SlotSyncCtxStruct, since the role of
> stopSignaled field has changed, those comments should be updated
> accordingly? For example,
>
> -------------------------
> - * the SQL function pg_sync_replication_slots(). When the startup process sets
> - * 'stopSignaled' during promotion, it uses this 'pid' to wake up the currently
> - * synchronizing process so that the process can immediately stop its
> - * synchronizing work on seeing 'stopSignaled' set.
> - * Setting 'stopSignaled' is also used to handle the race condition when the
> + * the SQL function pg_sync_replication_slots(). On promotion,
> + * the startup process sets 'stopSignaled' and uses this 'pid' to wake up
> + * the currently synchronizing process so that the process can
> + * immediately stop its synchronizing work.
> + * Setting 'stopSignaled' is used to handle the race condition when the
> -------------------------
>
Updated as suggested.
>
> +/*
> + * Interrupt flag set when PROCSIG_SLOTSYNC_MESSAGE is received, asking the
> + * slotsync worker or pg_sync_replication_slots() to stop because
> + * standby promotion has been triggered.
> + */
> +volatile sig_atomic_t SlotSyncShutdown = false;
>
> For the interrupt flag set in procsignal_sigusr1_handler(), other flags
> use a *Pending suffix (e.g., ProcSignalBarrierPending,
> ParallelApplyMessagePending), so SlotSyncShutdownPending would
> be more consistent.
>
>
> +void
> +HandleSlotSyncMessage(void)
>
> Functions called from ProcessInterrupts() typically use the Process* prefix
> (e.g., ProcessProcSignalBarrier(), ProcessParallelApplyMessages()),
> so ProcessSlotSyncMessage would be more consistent than HandleSlotSyncMessage.
>
Agree, fixed.
>
> + ereport(LOG,
> + errmsg("replication slot synchronization worker will stop because
> promotion is triggered"));
> +
> + proc_exit(0);
> + }
> + else
> + {
> + /*
> + * For the backend executing SQL function
> + * pg_sync_replication_slots().
> + */
> + ereport(ERROR,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("replication slot synchronization will stop because promotion
> is triggered"));
>
> The log messages say "will stop", but since sync hasn't started yet,
> "will not start" seems clearer here. For example, "replication slot
> synchronization worker will not start because promotion was triggered"
> and "replication slot synchronization will not start because promotion was
> triggered". Thought?
>
We were using the same log message in two places:
check_and_set_sync_info() and HandleSlotSyncMessage().
I think “will not start” fits better in the first case, while “will
stop” makes sense to keep in the second.
--
Thanks,
Nisha
| Attachment | Content-Type | Size |
|---|---|---|
| v5-0001-Prevent-slotsync-worker-API-hang-during-standby-p.patch | application/octet-stream | 10.2 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Dilip Kumar | 2026-03-30 04:20:54 | Re: Avoid multiple SetLatch() calls in procsignal_sigusr1_handler() |
| Previous Message | Dilip Kumar | 2026-03-30 04:11:33 | Re: Skipping schema changes in publication |