Quick Links

Re: Hash Indexes

From:	Sam Mason <sam(at)samason(dot)me(dot)uk>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Hash Indexes
Date:	2008-01-07 16:24:04
Message-ID:	20080107162404.GS11262@frubble.xen.chris-lamb.co.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Tue, Jan 08, 2008 at 01:49:53AM +1100, Naz Gassiep wrote:
> >You could always do something like:
> >
> >CREATE INDEX foo ON table((md5(textcol)));
> >
> >Then it will get used in queries like:
> >SELECT * FROM table WHERE md5(textcol) = md5('text');
>
> That's exactly what I was considering doing, however there is always the
> change of a hash collision. Yes, this is a very remote chance, however
> the ramifications of a collision under those circumstances is
> potentially catastrophic.

You could make it a UNIQUE index (i.e. CREATE UNIQUE INDEX and the rest
like above) if you wanted, or you could perform the query as:

SELECT * FROM table
WHERE md5(textcol) = md5('text')
AND textcol = 'text';

this should use the index to do the initial lookup and then filter out
colliding entries.

> I would assume that hash indexes have inbuilt mechanisms for collision
> checking before returning the row as a match. Am I correct in this
> assumption?

The above isn't using hash indexes in any way. You're creating a b-tree
index on top of the md5-hash of a column. The only index type that
support uniqueness constraints at the moment are b-tree indexes[1].

Sam

[1] http://www.postgresql.org/docs/current/static/sql-createindex.html#AEN47593

In response to

Re: Hash Indexes at 2008-01-07 14:49:53 from Naz Gassiep

Browse pgsql-general by date

	From	Date	Subject
Next Message	Thomas Pundt	2008-01-07 16:27:06	Re: File system level backup from 32bit to 64bit
Previous Message	Martijn van Oosterhout	2008-01-07 16:16:08	Re: Hash Indexes