Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Chris Travers <chris(at)metatrontech(dot)com>, Cristian Bittel <cbittel(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session
Date: 2010-09-10 07:46:23
Message-ID: AANLkTikBzy3ZcHfj-KW=0Vp9eGc2-4gDvc9fMWpfhkVn@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Fri, Sep 10, 2010 at 03:12, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Robert Haas wrote:
>> On Thu, Sep 9, 2010 at 3:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> >> We certainly now have MANY documented field complaints at least of the
>> >> exit-128-on-Windows problem, if not the more general
>> >> backend-exits-without-going-through-the-normal-cleanup-path problem.
>> >
>> > Right, which is why I still don't care to risk back-porting a fix for
>> > the latter.
>>
>> It's hard to say what the safest option is, I think.  There seem to be
>> basically three proposals on the table:
>>
>> 1. Back-port the dead-man switch, and ignore exit 128.
>> 2. Don't back-port the dead-man switch, but ignore exit 128 anyway.
>> 3. Revert to Magnus's original solution.
>>
>> Each of these has advantages and disadvantages.  The advantage of #1
>> is that it is safer than #2, and that is usually something we prize
>> fairly highly.  The disadvantage of #1 is that it involves
>> back-porting the dead-man switch, but on the flip side that code has
>> been out in the field for over a year now in 8.4, and AFAIK we haven't
>> had any trouble with it.  Solution #3 should be approximately as safe as
>> solution #1, and has the advantage of touching less code in the back
>> branches, but on the other hand it is also NEW code.  So I think it's
>> arguable which is the best solution.  I think I like option #2 least
>> among those choices, but it's a tough call.
>
> Well, the dead-man timer is for all platforms, while the 128 return
> failure is Win32-only, so I don't see why applying the dead-man timer
> makes sense when it might destabilize all platforms, when the bug is
> just on Win32, and I don't think using defines to make the dead-man
> timer Win32-only makes sense.

Yes, that's the problem, really.

> I think we have clear enough evidence that 128 on Win32 means
> no-such-child and we can be sure the child never got started on that
> platform.

We have evidence that 128 occurs in this case. I don't think we have
evidence that there is no other case in which it can happen, and we need
to investigate that further to be *sure*.

We can safely say that *we* never do exit(128). What if a third party
library does it? Or the operating system itself? For the OS we can
check it, but do we care about third party libraries?
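
To make the concern concrete, here is a rough sketch of what "ignore
exit 128" amounts to. This is not the actual postmaster code: the
names classify_child_exit, ChildExitAction and
CHILD_EXIT_NEVER_STARTED are hypothetical, and the logic is
simplified to three outcomes just to show where the special case sits.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified sketch. None of these names come from the
 * PostgreSQL source tree.
 */
#define CHILD_EXIT_NEVER_STARTED 128    /* the code seen in the field reports */

typedef enum
{
    CHILD_EXIT_OK,          /* clean exit, nothing to do */
    CHILD_EXIT_IGNORED,     /* assumed "child never got started" */
    CHILD_EXIT_CRASH        /* anything else: treat as a crash */
} ChildExitAction;

/*
 * Decide how the parent reacts to a child's exit code.
 * is_win32 is passed as a parameter only so the sketch can be
 * exercised on any platform.
 */
static ChildExitAction
classify_child_exit(int exit_code, bool is_win32)
{
    if (exit_code == 0)
        return CHILD_EXIT_OK;

    /* The special case under discussion: Win32 only. */
    if (is_win32 && exit_code == CHILD_EXIT_NEVER_STARTED)
        return CHILD_EXIT_IGNORED;

    return CHILD_EXIT_CRASH;
}
```

The point is that the check only sees the number. If anything running
inside an already-started child exits with 128, it lands in the
"ignored" branch as well, and that is the case we would need to rule
out.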

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
