"Marshall, Steve" <smarshall(at)wsi(dot)com> writes:
> I don't think a check for process existence is a bad idea, or even a
> bandaid. The comment in the code block in async.c says it is removing
> the entry in pg_listener because the backend process does not exist.
In general, the way to see if a process exists is to try to kill() it;
the fact that the kill failed is sufficient proof, at least in
Unix-land. If it's possible for kill() to fail for transient reasons
in our Windows implementation, that's a bug in the Windows emulation.
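
For illustration, the kill()-based probe described above could look roughly like the sketch below on a POSIX system. The helper name process_is_signalable is hypothetical and is not taken from async.c. It relies on the POSIX rule that signal 0 runs the usual existence and permission checks without delivering a signal.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from async.c): report whether we can signal
 * the given process.  Signal 0 delivers nothing; kill() only performs
 * its existence and permission checks.
 */
bool
process_is_signalable(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;        /* process exists and we may signal it */

    /*
     * On failure errno is normally ESRCH (no such process) or EPERM
     * (process exists but we lack permission).  Either way we cannot
     * signal it, so the caller gets the same answer for both.
     */
    return false;
}
```

Treating ESRCH and EPERM alike matches the argument in the next paragraph: a process that cannot be signaled is as good as gone for the purposes of a listener entry. The sketch only has value if kill() never fails for transient reasons, which is exactly what is in question for the Windows emulation.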
Another reason behind the async.c coding is that even if the process
does still exist, there's no point in maintaining a pg_listener entry
for it if we can't signal it.
Thirdly, this is hardly the only place where we expect kill() to work
reliably. You've managed to create a reproducible case illustrating
that it's not being reliable, but the same bug might account for other
failures much harder to reproduce and investigate.
So my opinion is that the real issue here is why the kill()
implementation is failing when it should not. We need to fix that,
not put band-aids in async.c.
As to how to fix it, I'll defer to other people more
Windows-knowledgeable. Maybe taking out the timeout is really
the best answer.
regards, tom lane