Skip site navigation (1) Skip section navigation (2)

Re: Unportable implementation of background worker start

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unportable implementation of background worker start
Date: 2017-04-24 20:16:44
Message-ID: 20170424201644.ln5ec3haynv7eykl@alap3.anarazel.de (view raw, whole thread or download thread mbox)
Thread:
Lists: pgsql-hackers
On 2017-04-21 23:50:41 -0400, Tom Lane wrote:
> I wrote:
> > Attached is a lightly-tested draft patch that converts the postmaster to
> > use a WaitEventSet for waiting in ServerLoop.  I've got mixed emotions
> > about whether this is the direction to proceed, though.
> 
> Attached are a couple of patches that represent a plausible Plan B.
> The first one changes the postmaster to run its signal handlers without
> specifying SA_RESTART.  I've confirmed that that seems to fix the
> select_parallel-test-takes-a-long-time problem on gaur/pademelon.
> The second one uses pselect, if available, to replace the unblock-signals/
> select()/block-signals dance in ServerLoop.  On platforms where pselect
> exists and works properly, that should fix the race condition I described
> previously.  On platforms where it doesn't, we're no worse off than
> before.
> 
> As mentioned in the comments for the second patch, even if we don't
> have working pselect(), the only problem is that ServerLoop's response
> to an interrupt might be delayed by as much as the up-to-1-minute timeout.
> The only existing case where that's really bad is launching multiple
> bgworkers.  I would therefore advocate also changing maybe_start_bgworker
> to start up to N bgworkers per call, where N is large enough to pretty
> much always satisfy simultaneously-arriving requests.  I'd pick 100 or
> so, but am willing to negotiate.
> 
> I think that these patches represent something we could back-patch
> without a lot of trepidation, unlike the WaitEventSet-based approach.
> Therefore, my proposal is to apply and backpatch these changes, and
> call it good for v10.  For v11, we could work on changing the postmaster
> to not do work in signal handlers, as discussed upthread.  That would
> supersede these two patches completely, though I'd still advocate for
> keeping the change in maybe_start_bgworker.
> 
> Note: for testing purposes, these patches are quite independent; just
> ignore the hunk in the second patch that changes a comment added by
> the first one.

Unclear if related, but
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2017-04-24%2019%3A30%3A42
has a suspicious timing of failing in a weird way.

- Andres


In response to

Responses

pgsql-hackers by date

Next:From: Robert HaasDate: 2017-04-24 20:18:30
Subject: Re: Patch - Tcl 8.6 version support for PostgreSQL
Previous:From: Robert HaasDate: 2017-04-24 20:16:36
Subject: Re: Adding support for Default partition in partitioning

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group