Re: Hash index todo list item

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Neil Conway" <neilc(at)samurai(dot)com>
Cc: "Kenneth Marshall" <ktm(at)rice(dot)edu>, <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Hash index todo list item
Date: 2007-09-07 11:55:37
Message-ID: 46E13C39.8060005@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Neil Conway wrote:
> You might find this patch useful:
>
> http://archives.postgresql.org/pgsql-patches/2005-05/msg00164.php

Oh, I had forgot about that.

> It implements the "just store the hash in the index" idea; it also sorts
> the entries in a bucket by the hash value, which allows binary search to
> be used to locate candidate matches.
>
> I was surprised that this didn't result in a performance improvement for
> the benchmarks that I ran, but I never got around to investigating
> further (either running more benchmarks or checking whether there was a
> bug in the implementation).

You did get a smaller index, which has cache benefits with larger
tables. You didn't compare the size comparison against b-tree, but a
smaller index is a good thing.

I think you left some money on the table on that front. Since all the
HashItems are fixed size, you can get rid of the line pointers
altogether. Given that a sizeof(HashItemData) is 8 bytes, and a line
pointer is 4 bytes, that should give a further 1/3 reduction in index
size. If you're willing to give up on the alignment of HashItemData, you
can get it down to 6 bytes.

Even from a CPU point of view, as the table becomes bigger and the
b-tree becomes deeper, the difference between O(1) and O(log n) lookup
starts to become more significant.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2007-09-07 12:09:11 Re: [FEATURE REQUEST] Streaming Onlinebackup (Maybe OFFTOPIC)
Previous Message Martijn van Oosterhout 2007-09-07 10:24:55 Re: [FEATURE REQUEST] Streaming Onlinebackup (Maybe OFFTOPIC)