Re: Hash index todo list item

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Shreya Bhargava <shreya_bhargav(at)yahoo(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Hash index todo list item
Date: 2007-11-06 10:59:54
Message-ID: 4730492A.7090609@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Shreya Bhargava wrote:
> 1. Populate the table with 80 million tuples.
> 2. Create HASH index on the table.
> 3. clear both linux cache & psql buffers.
> (exiting psql and restarting it cleared the psql buffers;
> to clear linux cache, we used drop_cache command)
> 4. start psql
> 5. select on an integer in the range of values in the table.
> (all test numbers were big ones, like 98934599)
> 6. record the time.
> 7. exit psql.
> 8. drop caches.(as described above)
> 9. repeat 4-8 for different numbers.
> 10. Drop Hash index.
> 11. Create Btree index and repeat 3-9.

It seems you're mostly measuring the overhead of starting a backend,
populating the relcache etc.

Restarting psql doesn't clear the postgres shared buffer cache. Or did
you mean that you restarted postgres?

Anyway, I don't think it's interesting to test with cleared caches.
Surely the metapage and first 1-2 levels of the b-tree would stay cached
all the time in real life.

> From the results obtained, the average of all the hash probes is 141.8ms, the average for btree is 168.5, a difference of about 27.The standard deviations are about 23, so this is a statistically significant difference.

I don't trust those numbers much, but in any case I don't think that
edge is big enough to justify the existence of hash indexes.

If you're looking for a use case where hash index is faster, I'd suggest
using a data type with an expensive comparison function. Like long
multi-byte strings in UTF-8 encoding.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-11-06 11:03:11 Re: Visibility map thoughts
Previous Message Gokulakannan Somasundaram 2007-11-06 08:27:42 Re: Hash index todo list item