Re: WL_SOCKET_ACCEPT fairness on Windows

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WL_SOCKET_ACCEPT fairness on Windows
Date: 2023-04-01 03:00:04
Message-ID: CA+hUKGKEOhqQKHzRLux0dfHs4eNcq0rr_1q=U4GB44+UytYeuQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 1, 2023 at 2:35 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I wonder if we ought to bite the bullet and replace the use of
> WaitForMultipleObjects() with RegisterWaitForSingleObject() and then use
> GetQueuedCompletionStatus() to wait. The fairness issue here is a motivation,
> but the bigger one is that that'd get us out from under the
> MAXIMUM_WAIT_OBJECTS (IIRC 64) limit. Afaict that'd also allow us to read
> multiple notifications at once, using GetQueuedCompletionStatusEx().

Interesting. So a callback would run in a system-managed thread, and
that would post a custom message in an IOCP for us to read, kinda like
the fake waitpid() thing? Seems a bit gross and context-switchy, but I
agree that the 64-event limit is also terrible.
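
To make that concrete, here's roughly how I imagine it hanging
together. Just a sketch with invented names (wait_iocp,
forward_to_iocp and friends are mine, not anything in the tree), with
no error handling or unregistration, but it shows the
register-then-post-to-IOCP dance and GetQueuedCompletionStatusEx()
collecting several notifications in one call:

#include <windows.h>

/* Created once, e.g. at WaitEventSet creation time. */
static HANDLE wait_iocp;

/*
 * Runs on a system thread-pool thread whenever a registered handle is
 * signaled; it just forwards the event's index to the completion port.
 */
static VOID CALLBACK
forward_to_iocp(PVOID param, BOOLEAN timed_out)
{
	/* timed_out can't happen here: we register with INFINITE timeout */
	PostQueuedCompletionStatus(wait_iocp, 0, (ULONG_PTR) param, NULL);
}

static void
register_handles(HANDLE *handles, int nhandles)
{
	wait_iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
	for (int i = 0; i < nhandles; i++)
	{
		HANDLE		reg;

		/* No 64-handle cap: each handle gets its own registration. */
		RegisterWaitForSingleObject(&reg, handles[i], forward_to_iocp,
									(PVOID) (ULONG_PTR) i, INFINITE, 0);
	}
}

/*
 * Wait for up to max_events notifications at once; this is the
 * GetQueuedCompletionStatusEx() point, one call can drain several.
 * Returns the number of event indexes written into indexes[].
 */
static int
wait_for_events(int *indexes, int max_events, DWORD timeout_ms)
{
	OVERLAPPED_ENTRY entries[64];
	ULONG		n = 0;
	ULONG		want = (ULONG) (max_events < 64 ? max_events : 64);

	if (!GetQueuedCompletionStatusEx(wait_iocp, entries, want, &n,
									 timeout_ms, FALSE))
		return 0;				/* timeout: GetLastError() == WAIT_TIMEOUT */
	for (ULONG i = 0; i < n; i++)
		indexes[i] = (int) entries[i].lpCompletionKey;
	return (int) n;
}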

> Medium term that'd also be a small step towards using readiness based APIs in
> a few places...

Yeah, that would be cool.

> > I think we could get the same effect as pgwin32_select() more cheaply
> > by doing the initial WaitForMultipleObjects() call with the caller's
> > timeout exactly as we do today, and then, while we have space,
> > repeatedly calling
> > WaitForMultipleObjects(handles=&events[last_signaled_event_index + 1],
> > timeout=0) until it reports timeout.
>
> That would make sense, and should indeed be reasonable cost-wise.

Cool.
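
To spell out the shape I have in mind, a minimal sketch (names
invented, error handling elided, and report() stands in for the code
that records one occurred event; not the patch itself):

#include <windows.h>

/*
 * After the initial blocking wait has reported handles[first], poll the
 * handles after it with a zero timeout until nothing more is ready or
 * the caller's output array (capacity max_events) is full.  Returns the
 * total number of events reported, including the initial one.
 */
static int
drain_extra_events(HANDLE *handles, int nhandles, int first,
				   void (*report) (int), int max_events)
{
	int			reported = 1;	/* the initial hit is already recorded */
	int			next = first + 1;

	while (reported < max_events && next < nhandles)
	{
		DWORD		rc = WaitForMultipleObjects(nhandles - next,
												&handles[next], FALSE, 0);

		if (rc == WAIT_TIMEOUT || rc == WAIT_FAILED)
			break;				/* nothing else ready (or error: punt) */

		next += (int) (rc - WAIT_OBJECT_0);	/* absolute index of the hit */
		report(next);
		reported++;
		next++;					/* resume scanning after this handle */
	}
	return reported;
}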

> > I mention this now because I'm not sure whether to consider this an
> > 'open item' for 16, or merely an enhancement for 17. I guess the
> > former, because someone might call that a new denial of service
> > vector. On the other hand, if you fill up the listen queue for socket
> > 1 with enough vigour, you're also denying service to socket 1, so I
> > don't know if it's worth worrying about. Opinions on that?
>
> I'm not sure either. It doesn't strike me as a particularly relevant
> bottleneck. And the old approach of doing more work for every single
> connection also made many connections worse, I think?

Alright, let's see if anyone else thinks this is worth fixing for 16.

> > diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
> > index f4123e7de7..cc7b572008 100644
> > --- a/src/backend/storage/ipc/latch.c
> > +++ b/src/backend/storage/ipc/latch.c
> > @@ -2025,6 +2025,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
> > */
> > cur_event = (WaitEvent *) &set->events[rc - WAIT_OBJECT_0 - 1];
> >
> > +loop:
> > +
>
> As far as I can tell, we'll never see WL_LATCH_SET or WL_POSTMASTER_DEATH. I
> think it'd look cleaner to move the body of if (cur_event->events & WL_SOCKET_MASK)
> into a separate function that we then also can call further down.

We could see them, AFAICS, and I don't see much advantage in making
that assumption? Shouldn't we just shove it in a loop, just like the
other platforms' implementations? Done in this version, which is best
viewed with git show --ignore-all-space.

> Seems like we could use returned_events to get nevents in the way you want it,
> without adding even more ++/-- to each of the different events?

Yeah. This time I use reported_events. I also fixed a maths failure:
I'd forgotten to use rc - WAIT_OBJECT_0, which suggests that CI never
reached that code.
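
For the archives, the arithmetic in question in isolation, written as
a hypothetical helper rather than a quote from the patch:

#include <windows.h>

/*
 * Poll handles[next..nhandles-1] without blocking and return the
 * absolute index of a newly signaled handle, or -1 if there's nothing
 * more to report.  The remaining capacity falls straight out of
 * returned_events (no separate counter), and the handle index has to
 * be recovered as rc - WAIT_OBJECT_0 before it can be used.
 */
static int
poll_one_more(HANDLE *handles, int nhandles, int next,
			  int returned_events, int nevents)
{
	DWORD		rc;

	if (returned_events >= nevents || next >= nhandles)
		return -1;				/* no room left, or no handles left */

	rc = WaitForMultipleObjects(nhandles - next, &handles[next], FALSE, 0);
	if (rc == WAIT_TIMEOUT || rc == WAIT_FAILED)
		return -1;

	return next + (int) (rc - WAIT_OBJECT_0);
}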

Attachment: v2-0001-Teach-WaitEventSetWait-to-report-multiple-events-.patch (application/x-patch, 9.8 KB)
