| From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
|---|---|
| To: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
| Cc: | Andreas Karlsson <andreas(at)proxel(dot)se>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Use proc_exit() in WalRcvWaitForStartPosition |
| Date: | 2026-04-10 06:16:37 |
| Message-ID: | CAHGQGwHHrDv8Z=D-UP+-RUR2yntv3Ab=yw4_uPWEMY0vLX_O6g@mail.gmail.com |
| Lists: | pgsql-hackers |
On Thu, Apr 9, 2026 at 10:09 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> On Thu, Apr 9, 2026 at 5:00 AM Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> >
> > On 4/8/26 11:08 AM, Chao Li wrote:
> > > While working on another patch, I happened to notice that WalRcvWaitForStartPosition() calls raw exit(1). I think this should use proc_exit(1) instead, so that the normal cleanup machinery is not bypassed.
> > >
> > > This tiny patch just replaces exit(1) with proc_exit(1) in WalRcvWaitForStartPosition().
> >
> > This looks likely to be correct since when we exit in WalReceiverMain()
> > (on WALRCV_STOPPING and WALRCV_STOPPED) we call proc_exit(1). I feel we
> > should exit the same way in WalRcvWaitForStartPosition() as we do in
> > WalReceiverMain() and if not I would like a comment explaining why those
> > two cases are different.
>
> +1
+1
> WalRcvWaitForStartPosition, WALRCV_STOPPING before entering wait loop
> uses proc_exit(0) for WALRCV_STOPPING, while this path should probably
> use proc_exit(0) as well (not proc_exit(1)), since the stop was a
> requested shutdown, not an error. Using exit code 1 for a clean
> stop-on-request seems inconsistent.
The requested shutdown is handled in ShutdownWalRcv(), which sets the state to
WALRCV_STOPPING and sends SIGTERM to the walreceiver.
Although this might be considered a normal shutdown (suggesting exit code 0),
when the walreceiver receives SIGTERM it exits via ereport(FATAL), resulting
in exit code 1. In contrast, if it exits early in WalRcvWaitForStartPosition()
due to the WALRCV_STOPPING state, it uses exit code 0, as you noted. So
there seems to be some inconsistency in exit codes.
That said, the exit code (0 vs 1) does not affect behavior, since
the postmaster treats both as non-crash exits.
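
To make that concrete, here is a minimal standalone sketch of the kind of check a parent process can apply to a child's wait status. It is not the postmaster's actual code: the helper names is_non_crash_exit() and wait_status_for_exit_code() are made up for illustration, and treating any other status as a crash is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: return true if the wait status describes a plain
 * exit with code 0 or 1.  Anything else (other exit codes, death by
 * signal) is treated as a crash in this sketch.
 */
static bool
is_non_crash_exit(int status)
{
	return WIFEXITED(status) &&
		(WEXITSTATUS(status) == 0 || WEXITSTATUS(status) == 1);
}

/*
 * Hypothetical helper: fork a child that exits immediately with "code"
 * and return the wait status seen by the parent, or -1 on failure.
 */
static int
wait_status_for_exit_code(int code)
{
	int			status;
	pid_t		pid;

	/* Only the low 8 bits of an exit code reach the parent. */
	assert(code >= 0 && code <= 255);

	pid = fork();
	if (pid < 0)
		return -1;				/* fork failed */
	if (pid == 0)
		_exit(code);			/* child: exit without running parent's atexit handlers */
	if (waitpid(pid, &status, 0) != pid)
		return -1;				/* wait failed */
	return status;
}
```

With these helpers, is_non_crash_exit(wait_status_for_exit_code(0)) and is_non_crash_exit(wait_status_for_exit_code(1)) both return true, while an exit code of 2 is classified as a crash. So choosing between 0 and 1 here is a matter of consistency only.
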
For consistency, I would prefer using exit code 1 in proc_exit() in
WalRcvWaitForStartPosition(), to match the ereport(FATAL) path. But I'm fine
with other approaches as well.
Also, the comment at the top of walreceiver.c may need updating:

 * Normal termination is by SIGTERM, which instructs the walreceiver to
 * exit(0). Emergency termination is by SIGQUIT; like any postmaster child
 * process, the walreceiver will simply abort and exit on SIGQUIT. A close
 * of the connection and a FATAL error are treated not as a crash but as
 * normal operation.

Regards,
--
Fujii Masao