Re: Unique constraints for non-btree indexes

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unique constraints for non-btree indexes
Date: 2006-01-19 12:55:39
Message-ID: 20060119125539.GI9949@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 18, 2006 at 05:27:29PM -0500, Tom Lane wrote:
> Right, the deadlock risk is exactly the reason you need some secret
> sauce or other. Btree's page-level lock ensures that two insertions of
> conflicting keys can't overlap (even if they ultimately get stored on
> different pages). That's not the only way to fix this but it's a pretty
> good way.

Ok, I'm not that great with locking issues, but how about this:

1. In the root page of the GiST index you store a counter, let's call
it the Insertion ID (IID).
2. An index tuple has a state where it is not visible yet and contains
an IID.
3. When you go to insert, you follow the normal GiST method all the way
down to the leaf node.
4. Once you know where you're going to put it, take an exclusive lock
on the page.
5. Now, take an exclusive lock on the root page, read the IID (this one
is yours) and store the next one back in the root page.
6. Now write the index tuple to the leaf page recording your IID.
7. Release the root page lock, then the leaf page lock.

The thing to note is that once someone has gotten a particular IID, all
IIDs that are less than it will already have been written to the index
leaf pages. Also, there should be no risk of deadlock since we lock the
pages in a consistant order.

8. Do a normal index scan with your ~ operator. Ignore any provisional
tuples with IID greater than yours. If a match has IID less than yours
you may block on it using the same logic as for b-tree indexes.
9. Once the scan in completed and you know that there are no duplicates
with your IID or less, make the tuple fully visible (for your
transaction anyway). At this point the IID is no longer relevent and
you can forget it.

What I tried to acheive was avoiding holding any locks while doing
scans, since for GiST indexes they may cover a lot of ground.

Perhaps a better name for IID is Generation ID?

Storing the IID could take extra space, but you could probably overlap
it with the ctid since no-one's going to lookup the data tuple before
it's fully visible.

OTOH, if the backend dies before completing the process, how does one
clean up the provisional index tuple? The ctid would provide a way for
VACUUM to know when to remove it.

> BTW, the deadlock risk also applies to deferred uniqueness checks.
> Again, in btree it's possible to avoid this if you do a fresh indexscan
> (and take a lock on the first scanned page while you do that). If you
> try to do it without consulting the index then you need some other way
> to break "ties".

I think the above should work for deferred checks as well, as long as
you store the list as "tuples to check" for each inserted tuple. These
lists should form a directed acyclic graph so there should be no risk
of deadlock. Conflicts with VACUUM is still an issue.

Any thoughts?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jonah H. Harris 2006-01-19 14:26:37 Re: No heap lookups on index
Previous Message Andrew Dunstan 2006-01-19 12:48:08 Re: pgxs/windows