
Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Luke Koops <luke(dot)koops(at)entrust(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram
Date: 2012-08-07 18:22:23
Message-ID: 1410.1344363743@sss.pgh.pa.us
Lists: pgsql-bugs

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> We just had a customer hit a very similar problem on 9.1.3, running on
> Windows Server 2008 SP2. ...
> The customer finds that they can reproduce this on a variety of
> systems under heavy load.

> Now, it looks to me like for this stack trace to happen,
> PgstatCollectorMain() has got to call pgwin32_waitforsinglesocket (at
> line 3002), and that function has to return true, so that got_data
> gets set to true.  Then PgstatCollectorMain() will call recv(), which
> on Windows will really be pgwin32_recv, which will call
> pgwin32_waitforsinglesocket, which must now hang.  The fact that the
> first pgwin32_waitforsinglesocket call returned true should mean that
> the stats collector socket is ready for read, while the fact that the
> second one did not return seems to imply that it's not ready for read,
> close, or accept.  So it almost looks like Windows can change its mind
> about whether the socket is readable.

> Or maybe we're telling it to change its mind.  This sounds an awful
> lot like something that could have been caused by the oversights fixed
> in commit b85427f2276d02756b558c0024949305ea65aca5.  Was there a
> reason we didn't back-patch that?

Sure: it was unproven that that fixed anything at all, much less that it
was bug-free enough to be safe to backpatch.  Neither of those things
has changed since May.  If you want you can try making up a 9.1 with
those changes and giving it to this customer to see if it fixes their
problems --- but without some field testing of that sort, I'm pretty
hesitant to put it into back branches.

			regards, tom lane
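
The sequence described in the quoted text (a readiness wait that returns true, followed by a blocking receive that never returns) can be illustrated with a small sketch. This is not the PostgreSQL code and not the change in commit b85427f2276d02756b558c0024949305ea65aca5; it is a generic illustration in Python using ordinary `select` and non-blocking UDP sockets, and the names `recv_if_ready` and `demo` are hypothetical. The idea it shows is that a readiness report is treated only as a hint, so the receive that follows must never be allowed to block.

```python
import select
import socket


def recv_if_ready(sock, timeout, bufsize=1024):
    """Wait up to `timeout` seconds, then try one non-blocking receive.

    Returns the datagram bytes, or None if nothing could be read.
    Assumes `sock` has already been put into non-blocking mode.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # timed out: nothing was reported readable
    try:
        # Readiness is only a hint. The datagram may no longer be
        # available by the time we read, so this call must not block.
        return sock.recv(bufsize)
    except BlockingIOError:
        # The socket "changed its mind"; the caller goes back to waiting.
        return None


def demo():
    """Exercise the helper over loopback UDP (hypothetical test harness)."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # let the OS pick a free port
    receiver.setblocking(False)       # recv() raises instead of hanging
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        empty = recv_if_ready(receiver, 0.1)   # nothing sent yet
        sender.sendto(b"stats", receiver.getsockname())
        got = recv_if_ready(receiver, 1.0)     # one datagram pending
        return empty, got
    finally:
        sender.close()
        receiver.close()
```

With this shape, a stale or mistaken readiness report costs one extra trip around the wait loop rather than an indefinite hang in the second wait.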
