Re: Add errdetail() with PID and UID about source of termination signal

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add errdetail() with PID and UID about source of termination signal
Date: 2026-02-26 09:25:47
Message-ID: CAKZiRmxvdYXhO1JrVLW3LkuoaTjYpKmhOzzOCOhw4wDM3rYRMQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 26, 2026 at 4:09 AM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:

Hi Chao,

> I just reviewed v4 again and got a few more comments:
>
> 1. This patch only set the global proc_die_sender_pid/uid to 0 at startup, then assign values to them upon receiving SIGTERM, and never reset them, which assumes a process must die upon SIGTERM. Is the assumption true? I guess not. If a process receives SIGTERM and not die immediately, then die for other reason, then it may report a misleading PID and UID.

Hmm, I'm not sure I follow. If we receive SIGTERM and not die immediately
(for whatever reason), then two scenarios can happen as far as I'm concerned:
* another SIGTERM comes in from the same or different uid/pid and it wll be
reported properly
* different SIGKILL, but in this case we won't report UID/PID at all

am I missing something or do You have any particular scenario in mind?
The flow will be wrapper_handler()->die()->SetLatch()->..->directyl to
err reporting facilities.

> 2.
> syncrpe.c uses errhint to print PID and UID, and postgres.c uses errdetail. We should keep consistency, maybe all use errhint.

Right, let's make it that way.

> 3.
> ```
> @@ -319,7 +323,11 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
> QueryCancelPending = false;
> ereport(WARNING,
> (errmsg("canceling wait for synchronous replication due to user request"),
> - errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
> + errdetail("The transaction has already committed locally, but might not have been replicated to the standby."),
> + proc_die_sender_pid == 0 ? 0 :
> + errhint("Signal sent by PID %d, UID %d.",
> + proc_die_sender_pid, proc_die_sender_uid)
> + ));
> SyncRepCancelWait();
> break;
> }
> ```
>
> I don’t think the query cancel case relates to SIGTERM, so we don’t need to log PID and UID here.

Right, it was superfluous.

v5 attached.

-J.

Attachment Content-Type Size
v5-0001-Add-errdetail-with-PID-and-UID-about-source-of-te.patch text/x-patch 12.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Kukushkin 2026-02-26 09:29:29 Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication
Previous Message Chao Li 2026-02-26 09:21:13 Re: guc: make dereference style consistent in check_backtrace_functions