Re: Hash index todo list item

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Kenneth Marshall" <ktm(at)rice(dot)edu>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash index todo list item
Date: 2007-09-03 22:24:20
Message-ID: 87642rie0b.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


"Kenneth Marshall" <ktm(at)rice(dot)edu> writes:

> On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote:
>> Kenneth Marshall <ktm(at)rice(dot)edu> writes:
>> > ... This is the rough plan. Does anyone see anything critical that
>> > is missing at this point?
>>
>> Sounds pretty good. Let me brain-dump one item on you: one thing that
>> hash currently has over btree is the ability to handle index items up
>> to a full page. Now, if you go with a scheme that only stores hash
>> codes and not the underlying data, you can not only handle that but
>> improve on it;

I think that would be a big selling point for hash indexes. It would let you
index even toasted data which are larger than a page. I'm not sure whether you
can make it work for unique indexes though. But for non-unique indexes I think
it would be a solid win and mean you cover a set of use cases quite distinct
from btree indexes.

> - Hash lookup is O(1) while btree is O(logN).

That's not really true. There's a tradeoff between insertion time and lookup
time. In order to get O(1) lookups you need to work pretty hard to maintain
the hash table including spending a lot of time reorganizing it when you grow
it. If you don't want to spend the time on inserts then you end up with
buckets and the hash table is basically just a linear speedup to whatever
algorithm you use to scan the buckets.

> - What about multi-column indexes? The current implementation
> only supports 1 column.

That seems kind of weird. It seems obvious that you mix the three hashes
together which reduces it to the solved problem.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Florian G. Pflug 2007-09-03 23:15:18 Re: Per-function GUC settings: trickier than it looked
Previous Message Tom Lane 2007-09-03 21:26:23 Re: Per-function GUC settings: trickier than it looked