Re: Using WaitEventSet in the postmaster

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using WaitEventSet in the postmaster
Date: 2022-12-05 18:09:39
Message-ID: 20221205180939.2p6zcrtevukgyay7@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-12-05 22:45:57 +1300, Thomas Munro wrote:
> On Sat, Dec 3, 2022 at 10:41 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > Here's an iteration like that. Still WIP grade. It passes, but there
> > must be something I don't understand about this computer program yet,
> > because if I move the "if (pending_..." section up into the block
> > where WL_LATCH_SET has arrived (instead of testing those variables
> > every time through the loop), a couple of tests leave zombie
> > (unreaped) processes behind, indicating that something funky happened
> > to the state machine that I haven't yet grokked. Will look more next
> > week.
>
> Duh. The reason for that was the pre-existing special case for
> PM_WAIT_DEAD_END, which used a sleep(100ms) loop to wait for children
> to exit, which I needed to change to a latch wait. Fixed in the next
> iteration, attached.
>
> The reason for the existing sleep-based approach was that we didn't
> want to accept any more connections (or spin furiously because the
> listen queue was non-empty). So in this version I invented a way to
> suppress socket events temporarily with WL_SOCKET_IGNORE, and then
> reactivate them after crash reinit.

That seems like an odd special flag. Why do we need it? Is it just because we
want to have assertions ensuring that something is being queried?

> * WL_SOCKET_ACCEPT is a new event for an incoming connection (on Unix,
> this is just another name for WL_SOCKET_READABLE, but Window has a
> different underlying event; this mirrors WL_SOCKET_CONNECTED on the
> other end of a connection)

Perhaps worth committing separately and soon? Seems pretty uncontroversial
from here.

> +/*
> + * Object representing the state of a postmaster.
> + *
> + * XXX Lots of global variables could move in here.
> + */
> +typedef struct
> +{
> + WaitEventSet *wes;
> +} Postmaster;
> +

Seems weird to introduce this but then basically have it be unused. I'd say
either have a preceding patch move at least a few members into it, or just
omit it for now.

> + /* This may configure SIGURG, depending on platform. */
> + InitializeLatchSupport();
> + InitLocalLatch();

I'm mildly preferring InitProcessLocalLatch(), but not sure why - there's not
really a conflicting meaning of "local" here.

> +/*
> + * Initialize the WaitEventSet we'll use in our main event loop.
> + */
> +static void
> +InitializeWaitSet(Postmaster *postmaster)
> +{
> + /* Set up a WaitEventSet for our latch and listening sockets. */
> + postmaster->wes = CreateWaitEventSet(CurrentMemoryContext, 1 + MAXLISTEN);
> + AddWaitEventToSet(postmaster->wes, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
> + for (int i = 0; i < MAXLISTEN; i++)
> + {
> + int fd = ListenSocket[i];
> +
> + if (fd == PGINVALID_SOCKET)
> + break;
> + AddWaitEventToSet(postmaster->wes, WL_SOCKET_ACCEPT, fd, NULL, NULL);
> + }
> +}

The naming seems like it could be confused with latch.h
infrastructure. InitPostmasterWaitSet()?

> +/*
> + * Activate or deactivate the server socket events.
> + */
> +static void
> +AdjustServerSocketEvents(Postmaster *postmaster, bool active)
> +{
> + for (int pos = 1; pos < GetNumRegisteredWaitEvents(postmaster->wes); ++pos)
> + ModifyWaitEvent(postmaster->wes,
> + pos, active ? WL_SOCKET_ACCEPT : WL_SOCKET_IGNORE,
> + NULL);
> +}

This seems to hardcode the specific wait events we're waiting for based on
latch.c infrastructure. Not really convinced that's a good idea.

> + /*
> + * Latch set by signal handler, or new connection pending on any of our
> + * sockets? If the latter, fork a child process to deal with it.
> + */
> + for (int i = 0; i < nevents; i++)
> {
> - if (errno != EINTR && errno != EWOULDBLOCK)
> + if (events[i].events & WL_LATCH_SET)
> {
> - ereport(LOG,
> - (errcode_for_socket_access(),
> - errmsg("select() failed in postmaster: %m")));
> - return STATUS_ERROR;
> + ResetLatch(MyLatch);
> +
> + /* Process work scheduled by signal handlers. */
> + if (pending_action_request)
> + process_action_request(postmaster);
> + if (pending_child_exit)
> + process_child_exit(postmaster);
> + if (pending_reload_request)
> + process_reload_request();
> + if (pending_shutdown_request)
> + process_shutdown_request(postmaster);
> }

Is the order of operations here quite right? Shouldn't we process a shutdown
request before the others? And a child exit before the request to start an
autovac worker etc?

ISTM it should be 1) shutdown request 2) child exit 3) config reload 4) action
request.

> /*
> - * pmdie -- signal handler for processing various postmaster signals.
> + * pg_ctl uses SIGTERM, SIGINT and SIGQUIT to request different types of
> + * shutdown.
> */
> static void
> -pmdie(SIGNAL_ARGS)
> +handle_shutdown_request_signal(SIGNAL_ARGS)
> {
> - int save_errno = errno;
> -
> - ereport(DEBUG2,
> - (errmsg_internal("postmaster received signal %d",
> - postgres_signal_arg)));
> + int save_errno = errno;
>
> switch (postgres_signal_arg)
> {
> case SIGTERM:
> + pending_shutdown_request = SmartShutdown;
> + break;
> + case SIGINT:
> + pending_shutdown_request = FastShutdown;
> + break;
> + case SIGQUIT:
> + pending_shutdown_request = ImmediateShutdown;
> + break;
> + }
> + SetLatch(MyLatch);
> +
> + errno = save_errno;
> +}

Hm, not having the "postmaster received signal" message anymore seems like a
loss when debugging things. I think process_shutdown_request() should emit
something like it.

I wonder if we should have a elog_sighand() that's written to be signal
safe. I've written versions of that numerous times for debugging, and it's a
bit silly to do that over and over again.

> @@ -2905,23 +2926,33 @@ pmdie(SIGNAL_ARGS)
> * Now wait for backends to exit. If there are none,
> * PostmasterStateMachine will take the next step.
> */
> - PostmasterStateMachine();
> + PostmasterStateMachine(postmaster);
> break;

I'm by now fairly certain that it's a bad idea to have this change mixed in
with the rest of this large-ish change.

> static void
> -PostmasterStateMachine(void)
> +PostmasterStateMachine(Postmaster *postmaster)
> {
> /* If we're doing a smart shutdown, try to advance that state. */
> if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
> @@ -3819,6 +3849,9 @@ PostmasterStateMachine(void)
> Assert(AutoVacPID == 0);
> /* syslogger is not considered here */
> pmState = PM_NO_CHILDREN;
> +
> + /* re-activate server socket events */
> + AdjustServerSocketEvents(postmaster, true);
> }
> }
>
> @@ -3905,6 +3938,9 @@ PostmasterStateMachine(void)
> pmState = PM_STARTUP;
> /* crash recovery started, reset SIGKILL flag */
> AbortStartTime = 0;
> +
> + /* start accepting server socket connection events again */
> + reenable_server_socket_events = true;
> }
> }

I don't think reenable_server_socket_events does anything as the patch stands
- I don't see it being checked anywhere? And in the path above, you're using
AdjustServerSocketEvents() directly.

> @@ -4094,6 +4130,7 @@ BackendStartup(Port *port)
> /* Hasn't asked to be notified about any bgworkers yet */
> bn->bgworker_notify = false;
>
> + PG_SETMASK(&BlockSig);
> #ifdef EXEC_BACKEND
> pid = backend_forkexec(port);
> #else /* !EXEC_BACKEND */

There are other calls to fork_process() - why don't they need the same
treatment?

Perhaps we should add an assertion to fork_process() ensuring that signals are
masked?

> diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
> index eb3a569aae..3bfef592eb 100644
> --- a/src/backend/storage/ipc/latch.c
> +++ b/src/backend/storage/ipc/latch.c
> @@ -283,6 +283,17 @@ InitializeLatchSupport(void)
> #ifdef WAIT_USE_SIGNALFD
> sigset_t signalfd_mask;
>
> + if (IsUnderPostmaster)
> + {
> + if (signal_fd != -1)
> + {
> + /* Release postmaster's signal FD; ignore any error */
> + (void) close(signal_fd);
> + signal_fd = -1;
> + ReleaseExternalFD();
> + }
> + }
> +

Hm - arguably it's a bug that we don't do this right now, correct?

> @@ -1201,6 +1214,7 @@ WaitEventAdjustKqueue(WaitEventSet *set, WaitEvent *event, int old_events)
> event->events == WL_POSTMASTER_DEATH ||
> (event->events & (WL_SOCKET_READABLE |
> WL_SOCKET_WRITEABLE |
> + WL_SOCKET_IGNORE |
> WL_SOCKET_CLOSED)));
>
> if (event->events == WL_POSTMASTER_DEATH)
> @@ -1312,6 +1326,8 @@ WaitEventAdjustWin32(WaitEventSet *set, WaitEvent *event)
> flags |= FD_WRITE;
> if (event->events & WL_SOCKET_CONNECTED)
> flags |= FD_CONNECT;
> + if (event->events & WL_SOCKET_ACCEPT)
> + flags |= FD_ACCEPT;
>
> if (*handle == WSA_INVALID_EVENT)
> {

I wonder if the code would end up easier to understand if we handled
WL_SOCKET_CONNECTED, WL_SOCKET_ACCEPT explicitly in the !WIN32 cases, rather
than redefining it to WL_SOCKET_READABLE.

> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index 3082093d1e..655e881688 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -24,7 +24,6 @@
> #include <signal.h>
> #include <unistd.h>
> #include <sys/resource.h>
> -#include <sys/select.h>
> #include <sys/socket.h>
> #include <sys/time.h>

Do you know why this include even existed?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-12-05 18:14:49 Re: [PATCH] Add native windows on arm64 support
Previous Message Robert Haas 2022-12-05 18:06:53 Re: ANY_VALUE aggregate