Re: Tricky bugs in concurrent index build

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Tricky bugs in concurrent index build
Date: 2006-08-24 09:41:37
Message-ID: 878xletqr2.fsf@enterprisedb.com
Lists: pgsql-hackers


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> I wrote:
>> The problem case is that we take a tuple and try to insert it into the index.
>> Meanwhile someone else updates the tuple, and they're faster than us so
>> they get the new version into the index first. Now our aminsert sees a
>> conflicting index entry, and as soon as it commits good aminsert will
>> raise a uniqueness error. There's no backoff for "oh, the tuple I'm
>> inserting stopped being live while I was inserting it".
>
> It's possible that the problem could be solved by introducing such a
> backoff, ie, make aminsert recheck liveness of the tuple-to-be-inserted
> before declaring error. Since we're about to fail anyway, performance
> of this code path probably isn't a huge issue. But I haven't thought
> through whether it can be made to work with that addition.

Yesterday I considered whether I could just catch the error in validate_index and
retest HeapTupleSatisfiesVacuum after the insert, but found that didn't work any
better. I don't remember the exact problem, though, and it's possible it would
work if the recheck were inside aminsert.
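
To make the idea concrete, something along these lines is what I picture in the
btree insert path. This is purely a sketch: recheck_tuple_is_dead() and the
variable names are made up, and the real thing would have to re-fetch the heap
tuple and consult HeapTupleSatisfiesVacuum with an appropriate buffer and
OldestXmin.

    /*
     * Hypothetical sketch only -- recheck_tuple_is_dead() does not exist.
     * Before reporting a unique violation, check whether the tuple we were
     * asked to insert has itself died in the meantime; if so, the
     * conflicting index entry is just a newer version of the same row and
     * there is no real violation to report.
     */
    if (found_conflict && checkUnique)
    {
        if (recheck_tuple_is_dead(heapRel, &itup->t_tid))
            return false;       /* our version is dead, no real conflict */

        ereport(ERROR,
                (errcode(ERRCODE_UNIQUE_VIOLATION),
                 errmsg("duplicate key violates unique constraint \"%s\"",
                        RelationGetRelationName(indexRel))));
    }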

> Unless someone's got a brilliant idea, my recommendation at this point
> is that we restrict the patch to building only non-unique indexes.
> Per discussion upthread, that's still a useful feature. We can revisit
> the problem of doing uniqueness checks correctly in some future release,
> but time to work on it for 8.2 is running out fast.

I agree. There's other functionality in this area that would be nice too, such
as REINDEX CONCURRENTLY and dropping the invalid index if the build fails. Once
one chunk gets into CVS it becomes easier to extend it incrementally, instead of
letting the divergence grow into one big merge someday.

I was also considering going ahead and implementing Hannu's ALTER INDEX SET
UNIQUE. We would then have the option of making CREATE UNIQUE INDEX CONCURRENTLY
invoke that code automatically afterwards. It would require a second waiting
phase and a full index scan, though, so it would be much slower than handling
uniqueness during the index build itself. On the plus side it would never have
to lock anything -- taking locks inside a command explicitly billed as
concurrent strikes me as undesirable.
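
Roughly, the validation pass I'm imagining would run after the second wait,
walk the index in key order, and complain if two live heap tuples share a key.
Again only a sketch; the scan and liveness helpers below are invented for
illustration.

    /*
     * Hand-wavy sketch of the ALTER INDEX ... SET UNIQUE validation pass.
     * None of the scan_* or tuple_is_live() helpers exist; they just show
     * the shape: scan the index in key order and error out if two live
     * heap tuples have equal keys.
     */
    scan = scan_index_in_key_order(indexRel);
    while (scan_next(scan, &curKey, &curTid))
    {
        if (haveLast &&
            keys_equal(indexRel, &lastKey, &curKey) &&
            tuple_is_live(heapRel, &lastTid) &&
            tuple_is_live(heapRel, &curTid))
            ereport(ERROR,
                    (errcode(ERRCODE_UNIQUE_VIOLATION),
                     errmsg("cannot mark index \"%s\" unique: duplicate keys exist",
                            RelationGetRelationName(indexRel))));

        lastKey = curKey;
        lastTid = curTid;
        haveLast = true;
    }

Concurrent insertions after the flag is set would still be checked by aminsert
itself, which is why the scan only needs to worry about what's already there.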

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
