Re: postmaster 8.2 eternally hangs in semaphore lock acquiring

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martin Pitt <martin(at)piware(dot)de>
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: postmaster 8.2 eternally hangs in semaphore lock acquiring
Date: 2007-03-29 18:38:07
Message-ID: 26770.1175193487@sss.pgh.pa.us
Lists: pgsql-bugs

I wrote:
> It's possible that this is not a deadlock per se, but the aftermath of
> someone having errored out without releasing the BtreeVacuumLock --- but
> I don't entirely see how that could happen either, at least not without
> a core dump scenario.

On closer inspection, the autovac stack trace

#4 0x080abe38 in _bt_end_vacuum (rel=0xb5f0b298) at nbtutils.c:1028
#5 0x080a9c68 in btbulkdelete (fcinfo=0xbfc58cd8) at nbtree.c:552

suggests that _bt_end_vacuum is called from the CATCH part of
btbulkdelete, and that provides an idea: if either of the elog(ERROR)
calls in _bt_start_vacuum were to actually fire, it would throw control
without having released BtreeVacuumLock, and then _bt_end_vacuum would
hang up. _bt_start_vacuum is coded on the assumption that the LWLock
would get released by transaction abort cleanup, but we'd fail before
getting there. So this is definitely a bug, but the next question is
what's triggering it --- both of those elogs should be "can't happen"
conditions.
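
The failure mode described above can be modeled in a few lines of plain C. This is only a sketch under stated assumptions, not the PostgreSQL source: setjmp/longjmp stands in for the TRY/CATCH and elog(ERROR) machinery, a boolean flag stands in for BtreeVacuumLock, and every function name here (try_acquire, start_vacuum, end_vacuum, bulkdelete_would_hang) is hypothetical. The lock attempt reports failure instead of blocking, so the model reports the hang rather than reproducing it.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-reentrant lock such as BtreeVacuumLock. */
static bool lock_held = false;

/* Hypothetical stand-in for the jump target set up by the TRY block. */
static jmp_buf error_context;

/* Returns false where a real lock acquisition would block forever. */
static bool try_acquire(void)
{
    if (lock_held)
        return false;
    lock_held = true;
    return true;
}

static void release(void)
{
    assert(lock_held);
    lock_held = false;
}

/*
 * Models the start routine: takes the lock, then may raise an error.
 * If release_before_error is false, the error leaves the lock held,
 * which is the situation described in the message.
 */
static void start_vacuum(bool fail, bool release_before_error)
{
    bool ok = try_acquire();

    assert(ok);
    if (fail)
    {
        if (release_before_error)
            release();
        longjmp(error_context, 1);  /* stand-in for elog(ERROR) */
    }
    release();
}

/* Models the end routine: it needs the same lock to clean up. */
static bool end_vacuum(void)
{
    if (!try_acquire())
        return false;               /* a real backend would hang here */
    release();
    return true;
}

/* Returns true if the cleanup call would block on the lock. */
bool bulkdelete_would_hang(bool fail, bool release_before_error)
{
    lock_held = false;
    if (setjmp(error_context) == 0)
    {
        start_vacuum(fail, release_before_error);
        return !end_vacuum();       /* normal path */
    }
    /* CATCH path: runs before any transaction-abort lock cleanup */
    return !end_vacuum();
}
```

Calling bulkdelete_would_hang(true, false) models an error raised while the lock is still held, and the cleanup call then finds the lock taken. Releasing the lock before raising the error, as in bulkdelete_would_hang(true, true), is one possible remedy in this model; whether the real fix takes that form is not stated here.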

> Is there anything in the postmaster log when this happens?

I repeat that with more urgency. Do you see any
"multiple active vacuums for index \"%s\"" or "out of btvacinfo slots"
log messages when these hangups occur?

regards, tom lane
