| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Add errdetail() with PID and UID about source of termination signal |
| Date: | 2026-04-09 06:14:23 |
| Message-ID: | 95D4DA4E-0756-419E-9D2F-5AAF14A1D30C@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Apr 9, 2026, at 11:11, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
>
>
>> On Apr 9, 2026, at 01:01, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> Hi,
>>
>> Attached is a very rough first draft for how I think this needs to look like.
>>
>> Basically, SIGNAL_INFO always will pass both the signal number and extended
>> information along to the signal handler. The extended information is a
>> postgres specific struct. If the platform can't provide the extended
>> information, the values are instead set to some default value indicating that
>> the information is not known.
>>
>> With that die() (and also StatementCancelHandler, ...) can just set whatever
>> globals it wants, without pqsignal.c needing to know about it.
>>
>> It also allows us to extend the amount of information in the future. E.g. I'd
>> like to log the reason for a segfault (could e.g. be an OOM kill or an umapped
>> region) to stderr.
>>
>> The annoying thing about it is needing to change nearly all the existing
>> references to SIG_IGN/SIG_DFL, to avoid warnings due to mismatched types.
>>
>>
>> On 2026-04-08 11:13:40 +0200, Jim Jones wrote:
>>> If I understood this thread correctly, the feature (v6) introduced a
>>> problematic dual signature for wrapper_handler in pgsignal.c:
>>
>>> +#if !defined(FRONTEND) && defined(HAVE_SA_SIGINFO)
>>> +static void
>>> +wrapper_handler(int signo, siginfo_t * info, void *context)
>>> +#else
>>> static void
>>> wrapper_handler(SIGNAL_ARGS)
>>> +#endif
>>
>> I think it's not a problem for wrapper_handler to change its signature, that's
>> a local implementation detail. The problem is that the way the arguments were
>> passed was just wrong. Because signal handlers can be nested and such
>> nastiness, doing any of that via global variables is a recipe for disaster.
>> It's also just ugly.
>>
>>
>>> .. and it should be rather done in the source (c.h) where SIGNAL_ARGS is
>>> defined:
>>>
>>> #ifndef SIGNAL_ARGS
>>> #define SIGNAL_ARGS int postgres_signal_arg
>>> #endif
>>>
>>> Something like:
>>>
>>> #ifndef SIGNAL_ARGS
>>> #ifdef HAVE_SA_SIGINFO
>>> #define SIGNAL_ARGS int postgres_signal_arg, siginfo_t
>>> *postgres_signal_info, void *postgres_signal_context
>>> #else
>>> #define SIGNAL_ARGS int postgres_signal_arg
>>> #endif
>>> #endif
>>>
>>> But wouldn't it mean that all handlers need to be updated as well, since
>>> they'd get new parameters?
>>
>> All the signal handlers actually use SIGNAL_ARGS themselves, so that part is
>> not a problem.
>>
>> However, if we did it like you sketch above, they'd all need ifdefs etc to be
>> able to access any extended information, which seems like a terrible
>> idea. Especially if we want this information on multiple platforms, where the
>> struct to be passed would differ.
>>
>> Hence in my prototype it's hidden behind a platform indepenedent struct of our
>> own.
>>
>>
>>> If this is the case, the change can be quite substantial.
>>
>> It's a bit annoying to do all the s/\b(SIG_(IGN|DFL)/PG_$1/, but it's not that
>> bad, I think?
>>
>> I unfortunately don't see a good other way to deal with it. We could have a
>> macro wrapper around pqsignal() that checks for SIG_IGN with a cast to the
>> system type, but that seems exceedingly ugly.
>>
>> Greetings,
>>
>> Andres Freund
>> <v1-0001-WIP-Support-for-extended-information-about-signal.patch>
>
> I think the core idea here is to add a new parameter so signal handlers can receive “pg_signal_info". Compared to my earlier proposal, which stored sender info in file-scope variables, I agree this solution is more flexible. I think my earlier direction was mainly trying to avoid changing the signal handler interface.
>
> So I have no objection to the overall direction. Since the patch is marked as WIP, I didn't review all the details yet. But one thing I want to point out is:
> ```
> +typedef struct pg_signal_info
> +{
> + pid_t pid; /* pid of sending process or 0 if unknown */
> + uid_t uid; /* uid of sending process or 0 if unknown */
> +} pg_signal_info;
> ```
>
> For uid, 0 is usually a valid value for root. So using 0 as the “unknown” value seems a bit awkward. Maybe we should instead document something like "uid is only meaningful when pid is not 0".
>
Forgot to mention, I got a lot of compile warnings, for example:
```
parallel.c:1052:20: warning: cast from 'void (*)(int)' to 'pqsigfunc' (aka 'void (*)(int, struct pg_signal_info *)') converts to incompatible function type [-Wcast-function-type-mismatch]
1052 | pqsignal(SIGPIPE, PG_SIG_IGN);
| ^~~~~~~~~~
../../../src/include/port.h:551:20: note: expanded from macro 'PG_SIG_IGN'
551 | #define PG_SIG_IGN (pqsigfunc) SIG_IGN
| ^~~~~~~~~~~~~~~~~~~
4 warnings generated.
```
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Mohamed ALi | 2026-04-09 06:16:48 | [PATCH] Fix: Partitioned parent index remains invalid after child indexes are repaired |
| Previous Message | wang.xiao.peng | 2026-04-09 06:12:39 | Re:Re: pg_test_timing: fix unit typo and widen diff type |