Possible PANIC in PostPrepare_Locks

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Possible PANIC in PostPrepare_Locks
Date: 2013-01-11 02:16:13
Message-ID: 15226.1357870573@sss.pgh.pa.us
Lists: pgsql-hackers

While looking at the lock code the other day, I noticed this comment in
PostPrepare_Locks():

	/*
	 * We cannot simply modify proclock->tag.myProc to reassign
	 * ownership of the lock, because that's part of the hash key and
	 * the proclock would then be in the wrong hash chain.  So, unlink
	 * and delete the old proclock; create a new one with the right
	 * contents; and link it into place.  We do it in this order to be
	 * certain we won't run out of shared memory (the way dynahash.c
	 * works, the deleted object is certain to be available for
	 * reallocation).
	 */

I believe this comment was correct when it was written, since at that
time the routine would have held an exclusive lock on the entire shared
lock table. However, the subsequent changes to partition the lock
hashtables broke it, because a partitioned hashtable's freelist is not
partitioned. This means that another process working in a different
partition could grab the freelist entry before we can recycle it. In
principle, therefore, it's possible that the HASH_ENTER operation a few
lines below this could find itself out of shared memory and be unable to
make a new proclock table entry for the lock. If that happens, it
PANICs.

I don't see any very good way to avoid a PANIC on failure there, because
(1) we have already committed the prepared transaction; it's too late to
back out and throw a regular error. And (2) without any PROCLOCK, the
shared-memory representation of the lock would be corrupt anyhow;
there's no owner. So what we have to do is ensure it can't fail.

The only moderately clean solution that comes to mind is to create a new
dynahash function with the semantics of "atomically update this entry's
hash key". It would have the same behavior as the existing
remove-and-reenter code sequence, but it would never put the entry on
the freelist, so there's no risk of another backend stealing it out
from under us.

BTW, the only reason this is safe at all is that the new entry must
belong in the same partition as the old one, cf proclock_hash().
Possibly this comment should point that out, because I spent a few
minutes wondering if that was another bug.

Also, it looks like we'll need two code paths in PostPrepare_Locks to
deal with the possibility that a conflicting entry already exists?
I'm not sure this is possible, but I'm not sure it's not, either.

In principle this is an ancient bug that we ought to back-patch, but
since we've never heard of this PANIC happening in the field, it
probably is sufficient to fix it in HEAD. The odds of it biting someone
seem really small.

Thoughts?

regards, tom lane
