Re: Tricky bugs in concurrent index build

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Tricky bugs in concurrent index build
Date: 2006-08-26 14:00:02
Message-ID: 87zmdrmwbh.fsf@enterprisedb.com
Lists: pgsql-hackers
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Barring objections, I'm off to program this.

A few concerns:

a) The use of ShareUpdateExclusiveLock is supposed to lock out concurrent
   vacuums. I just tried it and vacuum seemed to be unaffected. I'm going to
   retry it with a clean cvs checkout to be sure it isn't something in my
   local tree that's broken.

   Do we still need to block concurrent vacuums if we're using snapshots?
   Obviously we have to block them during phase 1, because vacuum would have
   no chance of removing tuples from our private collection of index tuples
   that haven't been pushed live yet. But if phase 2 ignores tuples too new
   to be visible in its snapshot, then it shouldn't care if dead tuples are
   deleted, even if those slots are later reused.


b) You introduced a LockRelationIdForSession() call (I didn't even realize we
   had this capability when I wrote the patch). Does this introduce the
   possibility of a deadlock, though? If one of the transactions we're waiting
   on holds a shared lock on the relation and is itself waiting for an
   exclusive lock on the relation, then it seems we'll wait forever for it to
   finish and never see either of our conditions for continuing. That would
   be fine, except that because we're waiting manually the deadlock detection
   code never has a chance to fire.
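
   The scenario above can be sketched as a toy wait-for graph. This is only
   an illustration, not PostgreSQL's deadlock detector: the node names
   "builder" and "T" and the function find_cycle are hypothetical. The
   assumption is that a detector can only see waits registered through the
   lock manager, so a sleep-and-poll wait contributes no edge.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return True if the directed wait-for graph contains a cycle."""
    graph = defaultdict(list)
    for waiter, holder in edges:
        graph[waiter].append(holder)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE (unvisited)

    def visit(node):
        colour[node] = GREY  # node is on the current DFS path
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: a cycle exists
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK  # fully explored, no cycle through here
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


# T wants an exclusive lock, which is blocked by the builder's session lock.
# This wait goes through the lock manager, so the edge is visible.
lock_edges = [("T", "builder")]

# The builder waits for T to finish. With a sleep-and-poll loop this edge is
# invisible to the detector; with a real lock wait it would be registered.
builder_wait = ("builder", "T")

visible_with_sleep = find_cycle(lock_edges)
visible_with_lock_wait = find_cycle(lock_edges + [builder_wait])
```

   In this model the cycle is only found once the builder's wait is
   represented as an edge, which is the motivation for waiting on a lock
   rather than sleeping.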

   To solve that we would have to replace the pg_sleep call with an
   XactLockTableWait. But I'm not clear how to find a transaction id to wait
   on. What we would want is any transaction id that has an xmin older
   than our xmin. Even that isn't ideal, since it wouldn't give us a chance to
   test our other way out: if we choose a transaction to wait on that doesn't
   hold even a share lock on our table, we could end up stuck longer than
   necessary (i.e. when we would have been able to momentarily acquire the
   exclusive lock on the table earlier).
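
   One way to picture the selection problem is the sketch below. It is a
   model under stated assumptions, not backend code: RunningXact and
   pick_wait_target are hypothetical names, transaction ids are plain
   integers with no wraparound, and the preference for transactions that
   hold a lock on our table is an assumed heuristic suggested by the caveat
   above rather than anything proposed in this thread.

```python
from dataclasses import dataclass


@dataclass
class RunningXact:
    xid: int                # transaction id we could wait on
    xmin: int               # oldest xid this transaction's snapshot can see
    holds_table_lock: bool  # does it hold any lock on the table being indexed?


def pick_wait_target(running, our_xmin):
    """Pick a transaction id to wait on, or None if nobody is old enough."""
    # Only transactions whose xmin is older than ours are relevant.
    candidates = [x for x in running if x.xmin < our_xmin]
    if not candidates:
        return None
    # Assumed heuristic: prefer transactions holding a lock on our table,
    # then the one with the oldest xmin.
    candidates.sort(key=lambda x: (not x.holds_table_lock, x.xmin))
    return candidates[0].xid


running = [
    RunningXact(xid=101, xmin=95, holds_table_lock=False),
    RunningXact(xid=102, xmin=97, holds_table_lock=True),
    RunningXact(xid=110, xmin=105, holds_table_lock=False),
]

target = pick_wait_target(running, our_xmin=100)
nothing_to_wait_for = pick_wait_target(running, our_xmin=90)
```

   Here transaction 102 is chosen over 101 even though 101 has the older
   xmin, because 101 holds no lock on the table and waiting on it could
   delay us for no benefit.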


c) It's a shame we don't support multiple simultaneous concurrent index
   builds. We could create a ShareUpdateShareLock that conflicts with the
   same list of locks that ShareUpdateExclusiveLock conflicts with, but not
   with itself. From a UI point of view there's no excuse for not doing this,
   but from an implementation point of view there's a limit of 10 lock types
   and this would bring us up to 9. This is my first time looking at this
   code, so I'm not sure how hard that limit is.
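
   The proposal can be modelled with a small conflict table. This is a
   sketch only: the eight existing modes and their conflicts follow the
   documented table-level lock modes, while ShareUpdateShareLock is the
   hypothetical mode described above and add_non_self_conflicting_twin is
   an invented helper, not anything in the lock manager.

```python
# Conflict sets for the eight existing table-level lock modes.
CONFLICTS = {
    "AccessShareLock": {"AccessExclusiveLock"},
    "RowShareLock": {"ExclusiveLock", "AccessExclusiveLock"},
    "RowExclusiveLock": {
        "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
    "ShareUpdateExclusiveLock": {
        "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
    "ShareLock": {
        "RowExclusiveLock", "ShareUpdateExclusiveLock",
        "ShareRowExclusiveLock", "ExclusiveLock", "AccessExclusiveLock",
    },
    "ShareRowExclusiveLock": {
        "RowExclusiveLock", "ShareUpdateExclusiveLock", "ShareLock",
        "ShareRowExclusiveLock", "ExclusiveLock", "AccessExclusiveLock",
    },
    "ExclusiveLock": {
        "RowShareLock", "RowExclusiveLock", "ShareUpdateExclusiveLock",
        "ShareLock", "ShareRowExclusiveLock", "ExclusiveLock",
        "AccessExclusiveLock",
    },
    "AccessExclusiveLock": {
        "AccessShareLock", "RowShareLock", "RowExclusiveLock",
        "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
}


def add_non_self_conflicting_twin(conflicts, existing, new):
    """Copy `existing`'s conflict list to `new`, without a self-conflict."""
    table = {mode: set(others) for mode, others in conflicts.items()}
    # Same list as `existing`; it contains `existing` but never `new`.
    table[new] = set(table[existing])
    # Keep the relation symmetric: each conflicting mode also blocks `new`.
    for mode in table[new]:
        table[mode].add(new)
    return table


def conflicts(table, held, requested):
    return requested in table[held]


SUE = "ShareUpdateExclusiveLock"
NEW = "ShareUpdateShareLock"  # hypothetical mode from the proposal
table = add_non_self_conflicting_twin(CONFLICTS, SUE, NEW)
```

   In this model two holders of the new mode do not block each other, yet
   each still blocks, and is blocked by, ShareUpdateExclusiveLock, which is
   the mode VACUUM takes. The table ends up with nine modes.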

   One caveat is that the two jobs would see each other and that would make it
   hard for them to proceed to phase 2. I think what would happen is that the
   first one to finish phase 1 would be able to continue as soon as the other
   finishes phase 1. The second one would have to wait until the first one's
   phase 2 finished.


-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
