Re: Using WaitEventSet in the postmaster

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using WaitEventSet in the postmaster
Date: 2023-01-07 05:08:11
Message-ID: CA+hUKGLPZ8NUPfwAfi6Gd0y4CRSzBCTeAy7irE2B=-YsxAO3fQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 7, 2023 at 12:25 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-01-07 11:08:36 +1300, Thomas Munro wrote:
> > 3. Is it OK to clobber the shared pending flag for SIGQUIT, SIGTERM,
> > SIGINT? If you send all of these extremely rapidly, it's
> > indeterminate which one will be seen by handle_shutdown_request().
>
> That doesn't seem optimal. I'm mostly worried that we can end up downgrading a
> shutdown request.

I was contemplating whether I needed to do some more push-ups to
prefer the first delivered signal (instead of the last), but you're
saying that it would be enough to prefer the fastest shutdown type, in
cases where more than one signal was handled between server loops.
WFM.

> It's not pretty, but to me the easiest fix here
> seems to be have separate pending_{fast,smart,immediate}_shutdown_request
> variables and deal with them in process_shutdown_request(). Might still make
> sense to have one pending_shutdown_request variable, to avoid unnecessary
> branches before calling process_shutdown_request().

OK, tried that way.
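
For the record, roughly the shape it takes now (a simplified sketch; the
attached patch is the real thing):

/* Separate flags, so a slower request can't overwrite a faster one. */
static volatile sig_atomic_t pending_pm_shutdown_request;
static volatile sig_atomic_t pending_pm_fast_shutdown_request;
static volatile sig_atomic_t pending_pm_immediate_shutdown_request;

static void
handle_shutdown_request(SIGNAL_ARGS)
{
    switch (postgres_signal_arg)
    {
        case SIGTERM:           /* smart shutdown */
            pending_pm_shutdown_request = true;
            break;
        case SIGINT:            /* fast shutdown */
            pending_pm_fast_shutdown_request = true;
            pending_pm_shutdown_request = true;
            break;
        case SIGQUIT:           /* immediate shutdown */
            pending_pm_immediate_shutdown_request = true;
            pending_pm_shutdown_request = true;
            break;
    }
    SetLatch(MyLatch);
}

static void
process_shutdown_request(void)
{
    pending_pm_shutdown_request = false;

    /*
     * If several shutdown signals were handled between server loops, act on
     * the fastest mode requested, not whichever handler happened to run last.
     */
    if (pending_pm_immediate_shutdown_request)
        Shutdown = Max(Shutdown, ImmediateShutdown);
    else if (pending_pm_fast_shutdown_request)
        Shutdown = Max(Shutdown, FastShutdown);
    else
        Shutdown = Max(Shutdown, SmartShutdown);

    pending_pm_immediate_shutdown_request = false;
    pending_pm_fast_shutdown_request = false;

    /* ... then proceed as the old SIGTERM/SIGINT/SIGQUIT handlers did ... */
}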

> > 5. Is the signal mask being correctly handled during forking? I
> > think so: I decided to push the masking logic directly into the
> > routine that forks, to make it easy to verify that all paths set the
> > mask the way we want.
>
> Hm. If I understand correctly, you used sigprocmask() directly (vs
> PG_SETMASK()) in fork_process() because you want the old mask? But why do we
> restore the prior mask, instead of just using PG_SETMASK(&UnBlockSig); as we
> still do in a bunch of places in the postmaster?

It makes zero difference in practice but I think it's a nicer way to
code it because it doesn't make an unnecessary assumption about what
the signal mask was on entry.
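
Concretely, the mask handling part looks something like this (sketch only;
everything else fork_process() does is elided):

pid_t
fork_process(void)
{
    pid_t       result;
    sigset_t    save_mask;

    /*
     * Block signals across the fork so the child can't run the postmaster's
     * handlers before it has installed its own.
     */
    sigprocmask(SIG_SETMASK, &BlockSig, &save_mask);

    result = fork();

    if (result == 0)
    {
        /* Child: signal setup happens later, elided here. */
    }
    else
    {
        /*
         * Parent: put back whatever mask the caller had, rather than
         * assuming it was UnBlockSig.
         */
        sigprocmask(SIG_SETMASK, &save_mask, NULL);
    }

    return result;
}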

> Not that I'm advocating for that, but would there be any real harm in just
> continuing to accept signals post fork? Now all the signal handlers should
> just end up pointlessly setting a local variable that's not going to be read
> any further? If true, it'd be good to add a comment explaining that this is
> just a belt-and-suspenders thing.

Seems plausible and a nice idea to research. I think it might take
some analysis of important signals that children might miss before
they install their own handlers. Comment added.

> > 6. Is the naming and style OK? Better ideas welcome, but basically I
> > tried to avoid all unnecessary refactoring and changes, so no real
> > logic moves around, and the changes are pretty close to "mechanical".
> > One bikeshed decision was what to call the {handle,process}_XXX
> > functions and associated flags.
>
> I wonder if it'd be good to have a _pm_ in the name.

I dunno about this one; it's all static stuff in a file called
postmaster.c, and one (now) already has pm in it (see below).

> > Maybe "action" isn't the best name;
>
> Yea, I don't like it. A shutdown is also an action, etc. What about just using
> _pmsignal_? It's a bit odd because there's two signals in the name, but it
> still feels better than 'action' and better than the existing sigusr1_handler.

Done.

> > +
> > +/* I/O multiplexing object */
> > +static WaitEventSet *wait_set;
>
> I'd name it a bit more obviously connected to postmaster, particularly because
> it does survive into forked processes and needs to be closed there.

Done, as pm_wait_set.

> > +/*
> > + * Activate or deactivate notifications of server socket events. Since we
> > + * don't currently have a way to remove events from an existing WaitEventSet,
> > + * we'll just destroy and recreate the whole thing. This is called during
> > + * shutdown so we can wait for backends to exit without accepting new
> > + * connections, and during crash reinitialization when we need to start
> > + * listening for new connections again.
> > + */
>
> I'd maybe reference that this gets cleaned up via ClosePostmasterPorts(), it's
> not *immediately* obvious.

Done.

> > +static void
> > +ConfigurePostmasterWaitSet(bool accept_connections)
> > +{
> > + if (wait_set)
> > + FreeWaitEventSet(wait_set);
> > + wait_set = NULL;
> > +
> > + wait_set = CreateWaitEventSet(CurrentMemoryContext, 1 + MAXLISTEN);
> > + AddWaitEventToSet(wait_set, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
>
> Is there any reason to use MAXLISTEN here? We know how many sockets we're
> listening on by this point, I think? No idea if the overhead matters anywhere,
> but ...

Fixed.
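
That is, count the live sockets and size the set accordingly, roughly like
this (sketch; the exact bookkeeping in the patch may differ):

static void
ConfigurePostmasterWaitSet(bool accept_connections)
{
    int         nsockets = 0;

    if (pm_wait_set)
        FreeWaitEventSet(pm_wait_set);
    pm_wait_set = NULL;

    /* Count the sockets we're actually listening on. */
    if (accept_connections)
    {
        while (nsockets < MAXLISTEN &&
               ListenSocket[nsockets] != PGINVALID_SOCKET)
            nsockets++;
    }

    /* Room for the latch plus exactly those sockets, not MAXLISTEN. */
    pm_wait_set = CreateWaitEventSet(CurrentMemoryContext, 1 + nsockets);
    AddWaitEventToSet(pm_wait_set, WL_LATCH_SET, PGINVALID_SOCKET,
                      MyLatch, NULL);

    for (int i = 0; i < nsockets; i++)
        AddWaitEventToSet(pm_wait_set, WL_SOCKET_ACCEPT, ListenSocket[i],
                          NULL, NULL);
}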

> > /*
> > - * New connection pending on any of our sockets? If so, fork a child
> > - * process to deal with it.
> > + * Latch set by signal handler, or new connection pending on any of
> > + * our sockets? If the latter, fork a child process to deal with it.
> > */
> > - if (selres > 0)
> > + for (int i = 0; i < nevents; i++)
> > {
>
> Hm. This is preexisting behaviour, but now it seems somewhat odd that we might
> end up happily forking a backend for each socket without checking signals
> in between. Forking might take a while, and if a signal arrived since the
> WaitEventSetWait() we'll not react to it.

Right, so if you have 64 server sockets and they all have clients
waiting on their listen queue, we'll service one connection from each
before getting back to checking for pmsignals or shutdown, and that's
unchanged in this patch. (FWIW I also noticed that problem while
experimenting with the idea that you could handle multiple clients in
one go on OSes that report the listen queue depth along with
WL_SOCKET_ACCEPT events, but didn't pursue it...)

I guess we could check every time through the nevents loop. I may
look into that in a later patch, but I prefer to keep the same policy
in this patch.
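
If we did go that way later, I imagine it would be not much more than this
(hypothetical, not in the attached patch; flag name as in the sketch above):

    for (int i = 0; i < nevents; i++)
    {
        /*
         * Notice a shutdown request that arrived while we were busy forking
         * for an earlier socket, instead of draining every ready socket
         * before looking.
         */
        if (pending_pm_shutdown_request)
            break;

        /* ... existing per-event handling: reset latch, accept and fork ... */
    }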

> > static void
> > PostmasterStateMachine(void)
> > @@ -3796,6 +3819,9 @@ PostmasterStateMachine(void)
> >
> > if (pmState == PM_WAIT_DEAD_END)
> > {
> > + /* Don't allow any new socket connection events. */
> > + ConfigurePostmasterWaitSet(false);
>
> Hm. Is anything actually using the wait set until we re-create it with (true)
> below?

Yes. While in PM_WAIT_DEAD_END state, waiting for children to exit,
there may be clients trying to connect. On master, we have a special
pg_usleep(100000L) instead of select() just for that state so we can
ignore incoming connections while waiting for the SIGCHLD reaper to
advance our state, but in this new world that's not enough. We need
to wait for the latch to be set by handle_child_exit_signal(). So I
used the regular WES to wait for the latch (that is, no more special
case for that state), but I need to ignore socket events. If I
didn't, then an incoming connection sitting in the listen queue would
cause the server loop to burn 100% CPU reporting a level-triggered
WL_SOCKET_ACCEPT for sockets that we don't want to accept (yet).
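
In other words, for that state the server loop boils down to roughly this
(sketch):

    WaitEvent   events[MAXLISTEN + 1];
    int         nevents;

    /* PM_WAIT_DEAD_END: stop watching the server sockets entirely... */
    ConfigurePostmasterWaitSet(false);

    /*
     * ...so this wait only returns when a signal handler (for example
     * handle_child_exit_signal()) sets the latch, and not for clients
     * sitting in the listen queue, which would otherwise cause
     * level-triggered WL_SOCKET_ACCEPT wakeups in a tight loop.
     */
    nevents = WaitEventSetWait(pm_wait_set,
                               -1L,     /* no timeout */
                               events, lengthof(events),
                               0);      /* postmaster reports no wait event */
    /* ... then process the latch event as usual ... */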

Attachment Content-Type Size
v9-0001-Give-the-postmaster-a-WaitEventSet-and-a-latch.patch text/x-patch 24.5 KB
