| From: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
|---|---|
| To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion? |
| Date: | 2026-03-31 06:04:55 |
| Message-ID: | CABdArM7fPqd9GSXLLyDZfX_bZkAaoJAVDGKSQULfvEvVZZHgsg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, Mar 30, 2026 at 4:39 PM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>
> On Mon, Mar 30, 2026 at 1:18 PM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
> > We were using the same log message in two places:
> > check_and_set_sync_info() and HandleSlotSyncMessage().
> > I think “will not start” fits better in the first case, while “will
> > stop” makes sense to keep in the second.
>
> Thanks for updating the patch!
>
> With the patch, in my testing, standby promotion always produces
> the following logs:
>
> LOG: replication slot synchronization worker will stop because
> promotion is triggered
> LOG: replication slot synchronization worker will not start
> because promotion was triggered
>
> It looks like the postmaster immediately restarts the slotsync worker after
> promotion terminates it, and that new worker then exits on seeing
> SlotSyncCtx->stopSignaled.
>
> IMO, always emitting both messages is a bit confusing. It would be nice to
> suppress the second one if possible.
>
> One idea would be to prevent the restart altogether. For example,
> ProcessSlotSyncMessage() could set SlotSyncCtx->last_start_time to
> a special value (like -1), and SlotSyncWorkerCanRestart() could return
> false (i.e., prevent postmater from starting up slotsync worker) when
> it sees that. Alternatively, SlotSyncWorkerCanRestart() could simply
> check SlotSyncCtx->stopSignaled.
>
> That said, as far as I remember correctly, postmaster is generally not
> supposed to touch shared memory (per the comments in postmaster.c),
> so I'm not sure this approach is acceptable. On the other hand,
> postmaster and the slotsync worker already rely on SlotSyncCtx->last_start_time,
> so perhaps there's some precedent here.
>
IIUC, checking SlotSyncCtx->stopSignaled in SlotSyncWorkerCanRestart()
may not be ideal, as it requires a spinlock to avoid races with the
startup process and it is disallowed to take lock in postmaster main
loop. Whereas, SlotSyncCtx->last_start_time doesn’t need a lock since
the postmaster accesses it only when the worker is not alive.
Another option could be to log in check_and_set_sync_info() at DEBUG1
instead of LOG level. This message appears only after stopSignaled is
set, when promotion is already in progress and the first worker has
logged “will stop…”. The second worker doesn’t do any real work. Since
there’s nothing actionable for users, using DEBUG1 would keep it
useful for debugging (e.g., noticing immediate restarts) while
avoiding extra log noise. Thoughts?
--
Thanks,
Nisha
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ajay Pal | 2026-03-31 06:14:12 | Re: [PATCH] Add support for INSERT ... SET syntax |
| Previous Message | Nisha Moond | 2026-03-31 06:01:41 | Re: Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion? |