Re: Cleaning up historical portability baggage

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cleaning up historical portability baggage
Date: 2022-08-05 14:37:50
Message-ID: CA+Tgmob_5AUNCzyFGJX6quYSnQnKCHW6DGGJa1noofJqSu+weg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 23, 2022 at 8:23 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> More generally, I'm not exactly convinced that changes like
> this are a readability improvement:
>
> -#ifdef HAVE_SETSID
> +#ifndef WIN32
>
> I'd rather not have the code cluttered with a sea of
> indistinguishable "#ifndef WIN32" tests when some of them could be
> more specific and more mnemonic.

I can see both sides of this issue. On the one hand, if there's a
large chunk of code that's surrounded by #ifndef WIN32, then it might
not be clear to the casual observer that the block of code in question
is working around the lack of setsid() rather than some other
Windows-specific weirdness. Comments can help with that, though. On
the other hand, to me, seeing HAVE_SETSID makes me think that we're
dealing with a configure probe, and that I might have to worry about
UNIX-like systems not having support for that primitive. If it says
WIN32, then I know that's all we're talking about, and that's clearer.

Looking at a HAVE_SETSID specifically, they seem to fall into three
different categories. First, there's a couple of places that look
roughly like this:

#ifdef HAVE_SETSID
if (setsid() < 0)
elog(FATAL, "setsid() failed: %m");
#endif

I don't think that changing this to WIN32 would confuse anybody.
Surely it's obvious that the WIN32 test is about setsid(), because
that's the only code in the block.

Then there are a couple of places that look like this:

/*
* If we have setsid(), signal the backend's whole process
* group
*/
#ifdef HAVE_SETSID
(void) kill(-pid, SIGTERM);
#else
(void) kill(pid, SIGTERM);
#endif

I think it would be clear enough to adopt a WIN32 test here if we also
adjusted the comment, e.g. "All non-Windows systems supported process
groups, and on such systems, we want to signal the entire group." But
I think there might be an even better option. On Windows, kill is
anyway getting defined to pgkill, and our version of pgkill is defined
to return EINVAL if you pass it a negative number. Why don't we just
have it change a negative value into a positive one? Then we can drop
all this conditional logic in the callers, who can just do (void)
kill(-pid, SIGWHATEVER) and we should be fine. On a quick look, it
appears to me that every call site that passes non-constant second
argument to kill() would be happy with this change to pgkill().

Finally, there's this sort of thing:

static void
signal_child(pid_t pid, int signal)
{
if (kill(pid, signal) < 0)
elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) pid, signal);
#ifdef HAVE_SETSID
switch (signal)
{
case SIGINT:
case SIGTERM:
case SIGQUIT:
case SIGSTOP:
case SIGKILL:
if (kill(-pid, signal) < 0)
elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) (-pid), signal);
break;
default:
break;
}
#endif
}

So the logic here says that we should send the signal to the child and
then if the signal is in a certain list and HAVE_SETSID is defined,
also send the same signal to the whole process group. Here HAVE_SETSID
is really just a proxy for whether the operating system has a notion
of process groups and, again, it seems OK to change this to a WIN32
test with proper comments. However, here again, I think there might be
a better option. The comment above this function notes that signalling
the process itself (as opposed to other process group members) is
assumed to not cause any problems, which implies that it's not a
desired behavior. So with the above redefinition of pgkill(), we could
rewrite this function like this:

static void
signal_child(pid_t pid, int signal)
{
switch (signal)
{
case SIGINT:
case SIGTERM:
case SIGQUIT:
case SIGSTOP:
case SIGKILL:
pid = -pid;
break;
default:
break;
}
if (kill(pid, signal) < 0)
elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) pid, signal);
}

Overall, I don't think it's a great idea to keep all of these
HAVE_WHATEVER macros around if the configure tests are gone. It might
be necessary in the short term to make sure we don't regress the
readability of the code, but I think it would be better to come up
with other techniques for keeping the code readable rather than
relying on the names of these vestigial macros as documentation.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-08-05 14:48:51 Re: Cleaning up historical portability baggage
Previous Message Georgios Kokolatos 2022-08-05 14:23:45 Re: Add LZ4 compression in pg_dump