Re: Unportable implementation of background worker start

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unportable implementation of background worker start
Date: 2017-04-24 20:16:44
Message-ID: 20170424201644.ln5ec3haynv7eykl@alap3.anarazel.de

On 2017-04-21 23:50:41 -0400, Tom Lane wrote:
> I wrote:
> > Attached is a lightly-tested draft patch that converts the postmaster to
> > use a WaitEventSet for waiting in ServerLoop. I've got mixed emotions
> > about whether this is the direction to proceed, though.
>
> Attached are a couple of patches that represent a plausible Plan B.
> The first one changes the postmaster to run its signal handlers without
> specifying SA_RESTART. I've confirmed that that seems to fix the
> select_parallel-test-takes-a-long-time problem on gaur/pademelon.
> The second one uses pselect, if available, to replace the unblock-signals/
> select()/block-signals dance in ServerLoop. On platforms where pselect
> exists and works properly, that should fix the race condition I described
> previously. On platforms where it doesn't, we're no worse off than
> before.
>
> As mentioned in the comments for the second patch, even if we don't
> have working pselect(), the only problem is that ServerLoop's response
> to an interrupt might be delayed by as much as the up-to-1-minute timeout.
> The only existing case where that's really bad is launching multiple
> bgworkers. I would therefore advocate also changing maybe_start_bgworker
> to start up to N bgworkers per call, where N is large enough to pretty
> much always satisfy simultaneously-arriving requests. I'd pick 100 or
> so, but am willing to negotiate.
>
> I think that these patches represent something we could back-patch
> without a lot of trepidation, unlike the WaitEventSet-based approach.
> Therefore, my proposal is to apply and backpatch these changes, and
> call it good for v10. For v11, we could work on changing the postmaster
> to not do work in signal handlers, as discussed upthread. That would
> supersede these two patches completely, though I'd still advocate for
> keeping the change in maybe_start_bgworker.
>
> Note: for testing purposes, these patches are quite independent; just
> ignore the hunk in the second patch that changes a comment added by
> the first one.

Unclear whether it's related, but
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
failed in a weird way, and the timing is suspicious.

- Andres
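
For readers following the thread, the two ideas in the quoted message (handlers installed without SA_RESTART, and pselect() replacing the unblock-signals/select()/block-signals sequence) can be sketched roughly as below. This is a minimal standalone illustration, not the actual patch: the names pending_request, request_handler, setup_signals and wait_for_work are hypothetical, SIGUSR1 is only a stand-in for whatever signals the real server loop handles, and the empty wait mask is a simplifying assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* for pselect() and sigaction() */
#endif

#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Set by the handler; the main loop inspects it after each wait. */
static volatile sig_atomic_t pending_request = 0;

static void
request_handler(int signo)
{
    (void) signo;
    pending_request = 1;
}

/*
 * Install the SIGUSR1 handler without SA_RESTART, so an interrupted wait
 * fails with EINTR instead of being resumed, and keep SIGUSR1 blocked
 * during normal execution.  Returns 0 on success, -1 on failure.
 */
int
setup_signals(void)
{
    struct sigaction sa;
    sigset_t    block;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = request_handler;
    sa.sa_flags = 0;                /* deliberately no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/*
 * Wait until fd is readable, a signal is delivered, or timeout_sec elapses.
 * Pass fd < 0 to wait on signals and the timeout only.
 *
 * pselect() installs the (here: empty) signal mask, waits, and restores the
 * previous mask as a single operation, so a signal that became pending
 * just before the call interrupts the wait instead of being missed.
 * Returns what pselect() returns; -1 with errno == EINTR means a handler ran.
 */
int
wait_for_work(int fd, long timeout_sec)
{
    fd_set      rfds;
    sigset_t    wait_mask;
    struct timespec timeout;

    FD_ZERO(&rfds);
    if (fd >= 0)
        FD_SET(fd, &rfds);

    sigemptyset(&wait_mask);        /* nothing blocked while waiting */
    timeout.tv_sec = timeout_sec;
    timeout.tv_nsec = 0;

    return pselect(fd >= 0 ? fd + 1 : 0,
                   fd >= 0 ? &rfds : NULL,
                   NULL, NULL, &timeout, &wait_mask);
}
```

With a separate unblock/select()/block sequence, a signal landing between the unblock and the select() call would run its handler and then leave select() sleeping for the full timeout. That is the race the quoted message refers to, and it only goes away on platforms where pselect() is implemented atomically.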
