Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Chris Travers <chris(at)metatrontech(dot)com>, Cristian Bittel <cbittel(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session
Date: 2010-08-24 13:38:43
Message-ID: AANLkTikaaAR_g45N8xA5OHN9PjZUa+7AziQJotfvyqvJ@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Tue, Aug 24, 2010 at 8:57 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Robert Haas wrote:
>> [moving to -hackers]
>>
>> On Thu, Aug 19, 2010 at 9:43 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> > I suspect this is the same problem as bug #4897, and probably also the
>> > same problem as this:
>> > http://archives.postgresql.org/pgsql-bugs/2009-08/msg00114.php
>> >
>> > and maybe also this and this:
>> > http://archives.postgresql.org/pgsql-bugs/2010-02/msg00179.php
>> > http://archives.postgresql.org/pgsql-admin/2009-05/msg00105.php
>> >
>> > Unfortunately, it seems that no one has been able to get a stack trace yet.
>>
>> Bruce pointed out yet another report of this problem to me:
>>
>> http://archives.postgresql.org/pgsql-general/2010-08/msg00550.php
>>
>> After some discussion with Magnus, I think what is going on here is
>> that the postmaster kicks off a new child process, which terminates
>> before it actually starts running our code, either in OS-supplied code
>> or some sort of "filter" like anti-spam or anti-virus software.  It's
>> presumably NOT dying in our code because - at least AFAICS - we don't
>> exit(128) anywhere.  One way we could possibly improve the situation
>> is to not treat this as a child crash - that is, don't do a
>> crash-and-restart cycle; just treat that backend as having done
>> elog(FATAL).  The trick is that you need a reliable way to distinguish
>> between a regular child crash and an "early" child crash.  Magnus
>> suggested perhaps we could create a mutex that the child grabs before
>> mapping shared memory; the postmaster could check whether the mutex
>> had been taken.  If so, we handle the crash normally; if not, we just
>> chalk it up to experience and continue on.
>>
>> This isn't really a "fix" for the bug in the sense that the nicest
>> thing of all would be to prevent the child from exiting abnormally in
>> the first place.  But it's far from clear that we can control that.
>
> This URL has some interesting details on our problem:
>
>        http://stackoverflow.com/questions/139090/getexitcodeprocess-returns-128
>
> Error code 128 is identified as:
>
>        error code 128 ERROR_WAIT_NO_CHILDREN 128 0x80 There are no child
>        processes to wait for
>
> and the suggested cause is:
>
>        Have a look at Desktop Heap memory.
>
>        Essentially the desktop heap issue comes down to exhausted resources (eg
>        starting too many processes). When your app runs out of these resources,
>        one of the symptoms is that you won't be able to start a new process,
>        and the call to CreateProcess will fail with code 128.
>
> My guess is that at the time of CreateProcess(), there is enough desktop
> heap memory, but at some later time, perhaps caused by a logout, there
> isn't and the process never gets started.

Yeah, that seems very plausible, although I don't know exactly how to verify it.
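
For reference, the "early" versus regular child crash distinction from earlier in the thread could look roughly like the sketch below. This is only an illustration of the decision logic, not PostgreSQL source: ChildSlot, child_mark_started() and classify_child_exit() are hypothetical names, and a plain boolean stands in for the mutex the child would grab before mapping shared memory. On Windows the real thing would presumably need an actual named mutex or similar object that the postmaster can inspect.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this is PostgreSQL source. */
typedef enum
{
    CHILD_EXIT_CLEAN,   /* exit code 0, nothing to do */
    CHILD_EXIT_EARLY,   /* died before reaching our code: treat like elog(FATAL) */
    CHILD_EXIT_CRASH    /* died after startup: normal crash-and-restart cycle */
} ChildExitKind;

typedef struct
{
    /* Stand-in for the mutex the child grabs before mapping shared memory. */
    bool marker_taken;
} ChildSlot;

/* Called by the child as its first action, before touching shared memory. */
static void
child_mark_started(ChildSlot *slot)
{
    slot->marker_taken = true;
}

/* Called by the postmaster when it reaps a child. */
static ChildExitKind
classify_child_exit(const ChildSlot *slot, int exit_code)
{
    if (exit_code == 0)
        return CHILD_EXIT_CLEAN;

    /* The marker was never taken, so the child cannot have touched shared state. */
    if (!slot->marker_taken)
        return CHILD_EXIT_EARLY;

    return CHILD_EXIT_CRASH;
}
```

With this shape, a child that exits with 128 before calling child_mark_started() is classified as CHILD_EXIT_EARLY and the postmaster just carries on, while the same exit code after the marker is taken still triggers the normal crash handling.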

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
