Re: server crash with "process 22821 releasing ProcSignal slot 32, but it contains 0"

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: server crash with "process 22821 releasing ProcSignal slot 32, but it contains 0"
Date: 2012-08-09 21:26:03
Message-ID: CAHyXU0xqeQFZ_qcVyUBubJB+tQ3rAb7g0yp+m1YuLTWGHtu70w@mail.gmail.com

On Tue, Jun 26, 2012 at 12:09 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Tue, Jun 26, 2012 at 12:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> I suspect (but haven't had time to prove and may not for several days
>>> -- unfortunately going on vacation momentarily) that this might be
>>> caused by pl/sh.
>>
>> Hm. The reported symptoms might be explainable if something had caused
>> multiple threads to become active within the backend process --- then
>> it would be plausible for it to try to do proc_exit cleanup twice.
>> Which would explain the first two errors, though I'm not sure how that
>> leads to failing to disown the process latch, as the third error
>> suggests must have happened. But I don't know enough about pl/sh to
>> know if it could cause threading activation.
>>
>>> In particular, we have a routine that was
>>> inadvertently applied to the database with Windows CR/LF instead of
>>> the normal Linux newline.
>>
>> This doesn't seem real promising as an explanation ...
>
> right -- just a suspicion. maybe the relevant point was that it
> immediately failed. operator invoking the busted routine (which I had
> to fix) and the crash were highly correlated, although it does not
> always crash. yesterday was very heavy load and today not so much.

Following up on this: it is pl/sh, and it is a newline issue. One of the
developers is using a tool (I think pgAdmin?) that appends \r
characters to the end of every line, which throws off pl/sh's
shebang parsing. The issuing query gets an error along the lines of
'could not exec', and the server goes belly up if there is significant
concurrent load when that is issued. This is an out-of-date pl/sh, so
I'm going to upgrade it and try to reproduce. If I still can, I'll
supply a test case.
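
To illustrate the newline problem described above, here is a minimal sketch
of how a trailing \r can end up in the interpreter name when a shebang line
is split only on \n. This is not pl/sh's actual parsing code; the function
names interpreter_naive and normalize_newlines are hypothetical and exist
only for this example.

```python
from typing import Optional


def interpreter_naive(script: str) -> Optional[str]:
    """Extract the interpreter path from the shebang line, splitting on '\\n' only.

    Hypothetical helper: it trims spaces and tabs but not '\\r', so a script
    saved with CR/LF line endings keeps the carriage return in the path.
    """
    first_line = script.split("\n", 1)[0]
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip(" \t").split(" ")[0]


def normalize_newlines(script: str) -> str:
    """Convert CR/LF and bare CR line endings to LF."""
    return script.replace("\r\n", "\n").replace("\r", "\n")


# A function body as it might be stored after editing in a CR/LF tool.
crlf_body = "#!/bin/sh\r\necho hello\r\n"

broken = interpreter_naive(crlf_body)                    # path ends in '\r'
fixed = interpreter_naive(normalize_newlines(crlf_body))  # clean path

print(repr(broken))
print(repr(fixed))
```

An exec of a path that ends in \r fails because no such file exists, which
would be consistent with the 'could not exec' error. Normalizing line endings
before the function body is stored avoids the problem regardless of how the
language handler parses the first line.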

merlin
