Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Stefan Keller <sfkeller(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Date: 2011-09-17 22:14:55
Message-ID: CAHyXU0wPJkYhQLioswNRe6YmXcC_JKZmDxXsb1jcLHo8+BuJkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Sep 17, 2011 at 4:48 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
>> On 14 September 2011 00:04, Stefan Keller <sfkeller(at)gmail(dot)com> wrote:
>>> Has this been verified on a recent release? I can't believe that hash
>>> performs so bad over all these points. Theory tells me otherwise and
>>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.
>
> My understanding is that a huge amount of work has gone into making
> btree what it is in
> PG, and not nearly as much work has gone into making hash indexes what
> they could be.
>
>
>> Hash indexes have been improved since 2005 - their performance was
>> improved quite a bit in 9.0. Here's a more recent analysis:
>>
>> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/
>
> They are 3 time faster to build.  But if you rip the WAL logging out
> of btree, how much faster would those get?
>
> Also, that link doesn't address concurrency of selects at all, only of inserts.

Of course hash indexes are faster to build than varlen string indexes
:-). I use natural keys 50-80% of the time and hash indexing would
remove some of the pain in cases where I don't need ordering and range
operations. In fact, if they are made to properly support wal logging
and uniqueness, I imagine they should supplant btree in a broad range
of cases, so much so that it would be awful nice to be able to have
syntax to choose hash for primary keys and unique constraints.

@ Jeff:
>I think that adding WAL to hash indexes without first
addressing the heavy-weight locking issue would be a mistake.
Even if the WAL was fixed, the bad performance under
concurrent selects would still make it at best a narrow
niche thing. And fixing the locking *after* WAL is in place would
probably be very much harder than the other order.

Here again, I think that any proposed improvement in the current hash
index code should be measured against wrapping a btree index. You
get wal logging and high concurrency for free if you decide to do
that.

merlin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2011-09-17 22:29:16 Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Previous Message Jeff Janes 2011-09-17 22:11:33 Re: Hash index use presently(?) discouraged since 2005: revive or bury it?