Tom Lane wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>> Ok, I think I know what's happening. In btbulkdelete we have a
>> PG_TRY-CATCH block. In the try-block, we call _bt_start_vacuum which
>> acquires and releases the BtreeVacuumLock. Under certain error
>> conditions, _bt_start_vacuum calls elog(ERROR) while holding the
>> BtreeVacuumLock. The PG_CATCH block calls _bt_end_vacuum which also
>> tries to acquire BtreeVacuumLock.
> This is definitely a bug (I unfortunately didn't see your message until
> after I'd replicated your reasoning...) but the word from Shuttleworth
> is that he doesn't see either of those messages in his postmaster log.
> So it seems we need another theory. I haven't a clue at the moment though.

The error message never makes it to the log. The deadlock occurs in the
PG_CATCH block, before the error is rethrown and printed. I added an
unconditional elog(ERROR) in _bt_start_vacuum to test it, and I'm
getting the same hang with no message in the log.

The unsafe pattern of calling elog while holding an lwlock in
_bt_start_vacuum needs to be fixed; patch attached. We still need to
figure out what's causing the error in the first place. With the patch,
we should at least get a proper error message and not hang when the
error occurs.
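
To illustrate the failure mode described above, here is a minimal standalone sketch in plain C. It is not PostgreSQL code and not the attached patch: ToyLock, toy_acquire, toy_error and bulkdelete_cleanup_ok are hypothetical names, setjmp/longjmp stands in for PG_TRY/PG_CATCH and elog(ERROR), and the toy lock reports "would block forever" by returning false instead of actually hanging. Releasing the lock before raising the error is shown as one plausible ordering that avoids the self-deadlock, assumed here for illustration only.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-reentrant lightweight lock. */
typedef struct { bool held; } ToyLock;

static ToyLock vacuum_lock;    /* plays the role of BtreeVacuumLock */
static jmp_buf error_context;  /* plays the role of the PG_TRY frame */

/* Returns false where a real non-reentrant lock would block forever. */
static bool toy_acquire(ToyLock *lock)
{
    if (lock->held)
        return false;
    lock->held = true;
    return true;
}

static void toy_release(ToyLock *lock)
{
    assert(lock->held);
    lock->held = false;
}

/* Stand-in for elog(ERROR): jumps straight to the catch branch. */
static void toy_error(void)
{
    longjmp(error_context, 1);
}

/* Stand-in for the start routine; always hits its error path here. */
static void start_vacuum(bool release_before_error)
{
    if (!toy_acquire(&vacuum_lock))
        return;
    if (release_before_error)
        toy_release(&vacuum_lock);  /* assumed safe ordering */
    toy_error();                    /* unsafe if the lock is still held */
}

/* Stand-in for the cleanup routine called from the catch branch. */
static bool end_vacuum(void)
{
    if (!toy_acquire(&vacuum_lock))
        return false;               /* the real code would hang here */
    toy_release(&vacuum_lock);
    return true;
}

/* Returns true if cleanup completes, false if it would self-deadlock. */
bool bulkdelete_cleanup_ok(bool release_before_error)
{
    vacuum_lock.held = false;
    if (setjmp(error_context) == 0)
    {
        start_vacuum(release_before_error);  /* "try" branch */
        return true;                         /* not reached in this sketch */
    }
    return end_vacuum();                     /* "catch" branch */
}
```

Calling bulkdelete_cleanup_ok(false) models the hang: the error is raised with the lock held, so the cleanup step can never acquire it and the error is never rethrown or logged. Calling bulkdelete_cleanup_ok(true) models the ordering in which the lock is dropped first, so cleanup completes and the error could then be reported normally.
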
Martin: Would it be possible for you to reproduce the problem with a