> This is a bit disturbing:
> The assertion failure seems to indicate that the number of
> LockMethodProcLockHash entries found by hash_seq_search didn't match the
> number that had been counted by hash_get_num_entries immediately before
> that. I don't see any bug in GetLockStatusData itself, so this suggests
> that there's something wrong with dynahash's entry counting, or that
> somebody somewhere is modifying the shared hash table without holding
> the appropriate lock. The latter seems a bit more likely, given that
> this must be a very low-probability bug or we'd have seen it before.
> An overlooked locking requirement in a seldom-taken code path would fit
> the symptoms.
After digging around a bit, I can find only one place where it looks
like somebody might be messing with the LockMethodProcLockHash table
while not holding the appropriate lock-partition LWLock(s):
1. VirtualXactLock finds that the target xact still holds its VXID lock fast-path.
2. VirtualXactLock calls SetupLockInTable to convert the fast-path lock to a regular lock.
3. SetupLockInTable makes entries in LockMethodLockHash and LockMethodProcLockHash.
I see no partition lock acquisition anywhere in the above code path.
Is there one that I'm missing? Why isn't SetupLockInTable documented
as expecting the caller to hold the partition lock, as is generally
done for lock.c subroutines that require that?
If this is a bug, it's rather disturbing that it took us this long to
recognize it. That code path isn't all that seldom-taken, AFAIK.
regards, tom lane