Re: 9.4 HEAD: select() failed in postmaster

From: Noah Misch <noah(at)leadboat(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, alvherre(at)2ndquadrant(dot)com
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 9.4 HEAD: select() failed in postmaster
Date: 2013-09-11 23:54:19
Message-ID: 20130911235419.GA243762@tornado.leadboat.com
Lists: pgsql-hackers

On Tue, Sep 10, 2013 at 05:18:21PM -0700, Jeff Janes wrote:
> I've been getting some failures after an immediate shutdown or crash,
> during severe IO stress, with the message:
>
> LOG: XX000: select() failed in postmaster: Invalid argument
> LOCATION: ServerLoop, postmaster.c:1560
>
> It is trying to sleep for -1 seconds.
>
> I think the problem is here, where there should be a Max rather than a Min:
>
> commit 82233ce7ea42d6ba519aaec63008aff49da6c7af
> Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> Date: Fri Jun 28 17:20:53 2013 -0400
>
> Send SIGKILL to children if they don't die quickly in immediate shutdown
>
> ...
>
> + /* remaining time, but at least 1 second */
> + timeout->tv_sec = Min(SIGKILL_CHILDREN_AFTER_SECS -
> + (time(NULL) - AbortStartTime), 1);

Agreed; good catch.
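
To make the arithmetic concrete, here is a minimal C sketch, not the postmaster code itself. The Min/Max macros are written out locally, the helper names timeout_with_min() and timeout_with_max() are hypothetical, and the value 5 for SIGKILL_CHILDREN_AFTER_SECS is an assumption chosen to be consistent with the -1 second sleep reported above.

```c
#include <time.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))
#define Max(x, y) ((x) > (y) ? (x) : (y))

/* Assumed value for illustration; check postmaster.c for the real one. */
#define SIGKILL_CHILDREN_AFTER_SECS 5

/* As committed: Min() caps the timeout at 1 second and lets it go negative. */
static long
timeout_with_min(time_t now, time_t abort_start)
{
    return Min(SIGKILL_CHILDREN_AFTER_SECS - (long) (now - abort_start), 1);
}

/* As the code comment intends: Max() keeps the timeout at 1 second or more. */
static long
timeout_with_max(time_t now, time_t abort_start)
{
    return Max(SIGKILL_CHILDREN_AFTER_SECS - (long) (now - abort_start), 1);
}
```

With abort_start = 100 and now = 106, timeout_with_min() yields -1, the value select() rejects as an invalid argument, while timeout_with_max() yields 1. With now = 102, timeout_with_min() yields 1 even though 3 seconds remain.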

> But I don't understand the logic behind this anyway. Why sleep at least 1
> second? If time is up, it is up, why not use zero as the minimum?

Offhand, clamping to zero does make more sense to me. It looks like Alvaro
added that bit in his pre-commit edits. Alvaro?
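
For comparison, a sketch of what clamping to zero might look like. This is an illustration under assumptions, not a proposed patch: set_sigkill_timeout() is a hypothetical helper, and 5 is again an assumed value for SIGKILL_CHILDREN_AFTER_SECS.

```c
#include <sys/time.h>
#include <time.h>

#define Max(x, y) ((x) > (y) ? (x) : (y))

/* Assumed value for illustration; check postmaster.c for the real one. */
#define SIGKILL_CHILDREN_AFTER_SECS 5

/*
 * Hypothetical helper: store the time left before SIGKILL is due,
 * clamped so that the result is never negative.
 */
static void
set_sigkill_timeout(struct timeval *timeout, time_t now, time_t abort_start)
{
    long remaining = SIGKILL_CHILDREN_AFTER_SECS - (long) (now - abort_start);

    timeout->tv_sec = Max(remaining, 0);
    timeout->tv_usec = 0;
}
```

A zero timeval makes select() return immediately instead of failing, so once the deadline has passed the loop would go straight on to its next iteration rather than sleeping an extra second.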

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
