Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Chris Travers <chris(at)metatrontech(dot)com>, Cristian Bittel <cbittel(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session
Date: 2010-08-24 12:57:34
Message-ID: 201008241257.o7OCvYt12456@momjian.us
Lists: pgsql-bugs, pgsql-hackers

Robert Haas wrote:
> [moving to -hackers]
> 
> On Thu, Aug 19, 2010 at 9:43 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > I suspect this is the same problem as bug #4897, and probably also the
> > same problem as this:
> > http://archives.postgresql.org/pgsql-bugs/2009-08/msg00114.php
> >
> > and maybe also this and this:
> > http://archives.postgresql.org/pgsql-bugs/2010-02/msg00179.php
> > http://archives.postgresql.org/pgsql-admin/2009-05/msg00105.php
> >
> > Unfortunately, it seems that no one has been able to get a stack trace yet.
> 
> Bruce pointed out yet another report of this problem to me:
> 
> http://archives.postgresql.org/pgsql-general/2010-08/msg00550.php
> 
> After some discussion with Magnus, I think what is going on here is
> that the postmaster kicks off a new child process, which terminates
> before it actually starts running our code, either in OS-supplied code
> or some sort of "filter" like anti-spam or anti-virus software.  It's
> presumably NOT dying in our code because - at least AFAICS - we don't
> exit(128) anywhere.  One way we could possibly improve the situation
> is to not treat this as a child crash - that is, don't do a
> crash-and-restart cycle; just treat that backend as having done
> elog(FATAL).  The trick is that you need a reliable way to distinguish
> between a regular child crash and an "early" child crash.  Magnus
> suggested perhaps we could create a mutex that the child grabs before
> mapping shared memory; the postmaster could check whether the mutex
> had been taken.  If so, we handle the crash normally; if not, we just
> chalk it up to experience and continue on.
> 
> This isn't really a "fix" for the bug in the sense that the nicest
> thing of all would be to prevent the child from exiting abnormally in
> the first place.  But it's far from clear that we can control that.

This URL has some interesting details on our problem:

	http://stackoverflow.com/questions/139090/getexitcodeprocess-returns-128

Error code 128 is identified as:

	error code 128 ERROR_WAIT_NO_CHILDREN 128 0x80 There are no child
	processes to wait for

and the suggested cause is:

	Have a look at Desktop Heap memory.
	
	Essentially the desktop heap issue comes down to exhausted resources (eg
	starting too many processes). When your app runs out of these resources,
	one of the symptoms is that you won't be able to start a new process,
	and the call to CreateProcess will fail with code 128.

My guess is that at the time of CreateProcess() there is enough desktop
heap memory, but at some later point, perhaps because of a logout, there
no longer is, and the process never finishes starting.
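
For illustration only, the postmaster-side decision Robert describes in
the quoted text (check whether the child took a mutex before mapping
shared memory, and skip the crash-and-restart cycle if it did not) could
be sketched roughly as below. This is not a patch and not PostgreSQL
source: every name here is hypothetical, and the sketch assumes the
postmaster already knows the child's exit code and whether the mutex was
taken. How that flag would be obtained on Windows is left out.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this is PostgreSQL source. */

#define EARLY_EXIT_CODE 128     /* exit status seen in these reports */

typedef enum ChildExitAction
{
    CHILD_EXIT_NORMAL,          /* clean exit, nothing to do */
    CHILD_EXIT_IGNORE_EARLY,    /* died before touching shared memory */
    CHILD_EXIT_CRASH_RESTART    /* may have touched shared state */
} ChildExitAction;

/*
 * Decide how to react to a dead child.
 *
 * startup_mutex_taken is assumed to be true only if the child grabbed
 * the (hypothetical) startup mutex before mapping shared memory.
 */
ChildExitAction
classify_child_exit(int exit_code, bool startup_mutex_taken)
{
    if (exit_code == 0)
        return CHILD_EXIT_NORMAL;

    /* Child never reached our code: treat it like elog(FATAL). */
    if (!startup_mutex_taken)
        return CHILD_EXIT_IGNORE_EARLY;

    /* Child got far enough to touch shared memory: handle as a crash. */
    return CHILD_EXIT_CRASH_RESTART;
}
```

With this sketch, classify_child_exit(EARLY_EXIT_CODE, false) yields
CHILD_EXIT_IGNORE_EARLY, while the same exit code with the mutex taken
yields CHILD_EXIT_CRASH_RESTART. Note that the decision keys on the
mutex flag, not on the value 128 itself, which matches the proposal as
quoted.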

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
