Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram

From: Nikhil Sontakke <nikhil(dot)sontakke(at)enterprisedb(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Luke Koops <luke(dot)koops(at)entrust(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram
Date: 2009-08-03 13:47:17
Message-ID: a301bfd90908030647l350f6348ne1231bb9fd113280@mail.gmail.com
Lists: pgsql-bugs

Hi,

>>
>>
>>> ntdll.dll!NtWaitForMultipleObjects+0xc
>>> kernel32.dll!WaitForMultipleObjectsEx+0x11a
>>> postgres.exe!pgwin32_waitforsinglesocket+0x1ed
>>> postgres.exe!pgwin32_recv+0x90
>>> postgres.exe!PgstatCollectorMain+0x17f
>>> postgres.exe!SubPostmasterMain+0x33a
>>> postgres.exe!main+0x168
>>> postgres.exe!__tmainCRTStartup+0x10f
>>> kernel32.dll!BaseProcessStart+0x23
>>
>> I have seen this problem too.  The process seems stuck for no good
>> reason.  I wondered at the time if it could be a kernel issue.  I
>> remember trying to send some data to the collector to verify whether
>> it'd wake up, but no luck.  (I mean I couldn't find a way to do it on
>> Windows).
>
> I have seen this as well, but only in cases where there has been
> broken firewall software or such things involved. I have seen a couple
> of reports from the field though.
>
> Anyway, this really is a should-never-happen thing. As soon as a new
> packet is sent in, WaitForMultipleObjectsEx() should return right
> away. And given that backends regularly send packets over, it
> shouldn't be an issue even if we miss one...
>

And this fact should lend credence to Alvaro's (and my own) suspicion
that this is a Windows kernel issue.

Given that, Magnus, I was wondering whether the READ case in
pgwin32_waitforsinglesocket() would benefit from the same approach as
the WRITE handling: waiting with a fixed timeout in a loop rather than
making an INFINITE call to WaitForMultipleObjectsEx(). I believe Teodor
Sigaev raised a similar concern a while back:

http://www.nabble.com/-GENERAL--Stats-collector-frozen--td8569977i20.html
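
To make the suggestion concrete, here is a minimal sketch of the
bounded-wait loop in portable C. It is not the actual
pgwin32_waitforsinglesocket() code: the names poll_result,
bounded_wait_fn, wait_for_read, fake_socket and fake_wait are all
hypothetical, and the callback only stands in for a call to
WaitForMultipleObjectsEx() with a finite timeout instead of INFINITE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of one bounded wait (hypothetical stand-in for the
 * "signaled", "timed out" and "failed" results of the real call). */
typedef enum { POLL_READY, POLL_TIMED_OUT, POLL_ERROR } poll_result;

/* Hypothetical callback: block for at most slice_ms milliseconds. */
typedef poll_result (*bounded_wait_fn)(void *ctx, int slice_ms);

/*
 * Wait until the socket is readable, but never block for longer than
 * slice_ms in a single call. After each timeout the loop waits again,
 * so a wakeup that was missed once does not hang the process forever.
 * Returns true when ready, false on error; *timeouts (if not NULL)
 * receives the number of slices that expired before that.
 */
static bool
wait_for_read(bounded_wait_fn wait_once, void *ctx, int slice_ms,
              int *timeouts)
{
    int expired = 0;

    assert(wait_once != NULL && slice_ms > 0);

    for (;;)
    {
        poll_result r = wait_once(ctx, slice_ms);

        if (r != POLL_TIMED_OUT)
        {
            if (timeouts != NULL)
                *timeouts = expired;
            return r == POLL_READY;
        }
        expired++;              /* timed out: go around and wait again */
    }
}

/* Test double: reports a timeout a fixed number of times, then ready. */
struct fake_socket
{
    int timeouts_before_ready;
};

static poll_result
fake_wait(void *ctx, int slice_ms)
{
    struct fake_socket *s = ctx;

    (void) slice_ms;            /* the fake never really sleeps */
    if (s->timeouts_before_ready > 0)
    {
        s->timeouts_before_ready--;
        return POLL_TIMED_OUT;
    }
    return POLL_READY;
}
```

In a real patch the timeout branch is presumably where the socket
would be rechecked before waiting again; the sketch only shows the
loop structure, and the slice length is left to the caller.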

Regards,
Nikhils
--
http://www.enterprisedb.com
