Re: Add errdetail() with PID and UID about source of termination signal

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add errdetail() with PID and UID about source of termination signal
Date: 2026-04-06 16:51:40
Message-ID: 1ba07c75-cc4e-410c-9af1-076feeeb9594@dunslane.net
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2026-02-26 Th 4:25 AM, Jakub Wartak wrote:
> On Thu, Feb 26, 2026 at 4:09 AM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
> Hi Chao,
>
>> I just reviewed v4 again and got a few more comments:
>>
>> 1. This patch only set the global proc_die_sender_pid/uid to 0 at startup, then assign values to them upon receiving SIGTERM, and never reset them, which assumes a process must die upon SIGTERM. Is the assumption true? I guess not. If a process receives SIGTERM and not die immediately, then die for other reason, then it may report a misleading PID and UID.
> Hmm, I'm not sure I follow. If we receive SIGTERM and not die immediately
> (for whatever reason), then two scenarios can happen as far as I'm concerned:
> * another SIGTERM comes in from the same or different uid/pid and it wll be
> reported properly
> * different SIGKILL, but in this case we won't report UID/PID at all
>
> am I missing something or do You have any particular scenario in mind?
> The flow will be wrapper_handler()->die()->SetLatch()->..->directyl to
> err reporting facilities.
>
>> 2.
>> syncrpe.c uses errhint to print PID and UID, and postgres.c uses errdetail. We should keep consistency, maybe all use errhint.
> Right, let's make it that way.
>
>> 3.
>> ```
>> @@ -319,7 +323,11 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
>> QueryCancelPending = false;
>> ereport(WARNING,
>> (errmsg("canceling wait for synchronous replication due to user request"),
>> - errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
>> + errdetail("The transaction has already committed locally, but might not have been replicated to the standby."),
>> + proc_die_sender_pid == 0 ? 0 :
>> + errhint("Signal sent by PID %d, UID %d.",
>> + proc_die_sender_pid, proc_die_sender_uid)
>> + ));
>> SyncRepCancelWait();
>> break;
>> }
>> ```
>>
>> I don’t think the query cancel case relates to SIGTERM, so we don’t need to log PID and UID here.
> Right, it was superfluous.
>
> v5 attached.
>

I'd kinda like to sneak this in for pg19, because I think it's useful.
Here's a v6 that changes one or two things:

- changes the globals to sig_atomic_t

- in ProcessInterrupts, copies to local sender_pid/sender_uid, then
zeros the globals before any ereport

- uses errdetail() for all the messages

Plus a few more cosmetic changes like consistent casing.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment Content-Type Size
v6-0001-Add-errdetail-with-PID-and-UID-about-source-of-te.patch text/x-patch 12.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2026-04-06 17:00:00 Re: Changing the state of data checksums in a running cluster
Previous Message Melanie Plageman 2026-04-06 16:50:58 Re: EXPLAIN: showing ReadStream / prefetch stats