Re: <note> on hash indexes

From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: <note> on hash indexes
Date: 2009-02-04 23:48:50
Message-ID: 20090204234849.GE2995@it.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 04, 2009 at 11:22:44PM +0100, Zdenek Kotala wrote:
> The main speed improvement is for varchar datatype. I think It should be
> mention here as well. IIRC, times are similar with B-Tree for integer
> datatype.
>
> Zdenek
>
> Kenneth Marshall p????e v st 04. 02. 2009 v 13:57 -0600:
> > I had submitted the documentation change as part of my
> > hash function patch but it was removed as not relevant.
> > (It wasn't really.) I would basically remove the first
> > sentence:
> >
> > Note: Hash index operations are not presently WAL-logged,
> > so hash indexes might need to be rebuilt with REINDEX after a
> > database crash. For this reason, hash index use is presently
> > discouraged.
> >
> > Ken
> >
> >
> > On Wed, Feb 04, 2009 at 01:22:23PM -0300, Alvaro Herrera wrote:
> > > Hi,
> > >
> > > indices.sgml contains this paragraph about hash indexes:
> > >
> > > Note: Testing has shown PostgreSQL's hash indexes to perform no
> > > better than B-tree indexes, and the index size and build time for hash
> > > indexes is much worse. Furthermore, hash index operations are not
> > > presently WAL-logged, so hash indexes might need to be rebuilt with
> > > REINDEX after a database crash. For these reasons, hash index use is
> > > presently discouraged.
> > >
> > >
> > > However, it seems to me that hash indexes are much improved in 8.4, so
> > > maybe this needs to be reworded. I'm not sure to what point they have
> > > been improved though.
> > >
> > > --
> > > Alvaro Herrera http://www.CommandPrompt.com/
> > > PostgreSQL Replication, Consulting, Custom Development, 24x7 support
> > >

The speed improvement applies particularly to any datatype that is
larger than an integer (typically 4 bytes). Also the size of fields that
can be indexed efficiently is much, much larger than the 2K for Btree.
And even 32-bit quantities may be indexed more efficiently than Btrees
for large indexes due to the O(1) probe behavior. Btrees typically need
to cache/probe the upper levels of the tree to locate the tuple. I have
held off on extensive benchmarking until WAL has been implemented.

Regards,
Ken

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2009-02-05 01:35:38 Re: Hot standby, recovery infra
Previous Message Dann Corbit 2009-02-04 23:40:37 Re: Is a plan for lmza commpression in pg_dump