Quick Links

Re: Hash Indexes

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Hash Indexes
Date:	2016-09-21 12:29:59
Message-ID:	CA+TgmoYF3O5=MYJ8HYH-g5RuKKX3BD0yGXDP7vU5JiUamwgOjw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Sep 20, 2016 at 7:55 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> If it turns out that it has little benefit, then we don't really need
>> to step up our user education. People can just keep using btree like
>
> The big problem is people coming from other databases and assuming our
> hash indexes have the same benefits over btree that exist in some other
> database software. The 9.5 warning at least helps with that.

I'd be curious what benefits people expect to get. For example, I
searched for "Oracle hash indexes" using Google and found this page:

http://logicalread.solarwinds.com/oracle-11g-hash-indexes-mc02/

It implies that their hash indexes are actually clustered indexes;
that is, the table data is physically organized into contiguous chunks
by hash bucket. Also, they can't split buckets on the fly. I think
the DB2 implementation is similar. So our hash indexes, even once we
add write-ahead logging and better concurrency, will be somewhat
different from those products. However, I'm not actually sure how
widely-used those index types are. I wonder if people who use hash
indexes in PostgreSQL are even likely to be familiar with those
technologies, and what expectations they might have.

For PostgreSQL, I expect the benefits of improving hash indexes to be
(1) slightly better raw performance for equality comparisons and (2)
better concurrency. The details aren't very clear at this stage. We
know that write performance is bad right now, even with Amit's
patches, but that's without the kill_prior_tuple optimization which is
probably extremely important but which has never been implemented for
hash indexes. Read performance is good, but there are still further
optimizations that haven't been done there, too, so it may be even
better by the time Amit gets done working in this area.

Of course, if we want to implement clustered indexes, that's going to
require significant changes to the heap format ... or the ability to
support multiple heap storage formats. I'm not opposed to that, but I
think it makes sense to fix the existing implementation first.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Hash Indexes at 2016-09-20 23:55:36 from Bruce Momjian

Responses

Re: Hash Indexes at 2016-09-21 16:49:15 from Oskari Saarenmaa
Re: Hash Indexes at 2016-09-21 19:12:43 from Bruce Momjian
Re: Hash Indexes at 2016-09-21 19:44:15 from Geoff Winkless

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2016-09-21 12:43:44	Re: pageinspect: Hash index support
Previous Message	Jesper Pedersen	2016-09-21 12:27:06	Re: pageinspect: Hash index support