Re: Tricky bugs in concurrent index build

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Tricky bugs in concurrent index build
Date: 2006-08-26 17:42:34
Message-ID: 13849.1156614154@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> b) You introduced a LockRelationIdForSession() call (I even didn't realize we
>> had this capability when I wrote the patch). Does this introduce the
>> possibility of a deadlock though?

> Indeed, I had noticed this while testing your point (a). I thought that
> ConditionalLockRelation had enough smarts to grant itself a lock if
> there would be a deadlock possibility otherwise, but it seems not to be
> noticing. We'll need to look into that.

OK, the code I was remembering is the bit in ProcSleep to grant our proc
the lock if we can prove that deadlock detection would do so anyway.
It is probably possible to refactor things so that that's done before
we fail in the ConditionalLockAcquire case, but it would involve
uglifying ProcSleep's API even more. And it doesn't sound like a
complete solution anyway --- the ProcSleep test is not capable of
detecting deadlocks involving more than one lock, and I'm not sure it
even intends to find all cases involving just one lock. It was only
ever meant as a simple optimization to avoid a sleep in easy cases.

>> To solve that we would have to replace the pg_sleep call with a
>> XactLockTableWait. But I'm not clear how to find a transaction id to wait
>> on.

I'm toying with the idea of adding a lock manager call defined as
"give me a list of XIDs that currently hold locks conflicting with
lockmode X on object Y" --- but not including XIDs merely waiting
for such a lock. Then we could get the list of XIDs currently blocking
ExclusiveLock, and do XactLockTableWait on each one. Once they are
all gone we are safe; we don't care about later acquirers of locks
because they must see the index. This avoids both the arbitrary
pg_sleep and the need for a check on "am I the oldest remaining XID";
and if we are in a deadlock situation it will be detected.

This looks like it should be easy enough to code (though I haven't
actually tried yet) ... do you see any logical flaws?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chahine Hamila 2006-08-26 18:18:04 Re: integration of pgcluster into postgresql
Previous Message Bruce Momjian 2006-08-26 17:31:20 Re: [HACKERS] New XML section for documentation