Re: [PATCH]-hash index improving

From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "Xiao Meng" <mx(dot)cogito(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, "Kenneth Marshall" <ktm(at)rice(dot)edu>
Subject: Re: [PATCH]-hash index improving
Date: 2008-07-17 20:24:28
Message-ID: 36e682920807171324l61e319cia60f5cbfbecf6ae@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng <mx(dot)cogito(at)gmail(dot)com> wrote:
> The patch store hash code only in the index tuple.
> It based on Neil Conway's patch with an old version of PostgreSQL.
> It passes the regression test but I didn't test the performance yet.
> Anyone interested can make a performance test;-)
> You can undefine the macro HASHVALUE_ONLY in hash.h to get the
> original implementation.
> It's a preliminary implementation and I'm looking for input here.
> Hope to hear from you.

I've spent some time today performing tests similar to those mentioned
here (http://archives.postgresql.org/pgsql-hackers/2007-09/msg00208.php)

Using a word list of 2650024 unique words (maximum length is 118
bytes), build times are still high, and I'm not really seeing any
performance improvement over b-tree. I haven't profiled it yet, but
my test is as follows:

- Created the dict table
- Loaded the dict table
- Counted the records in the dict table
- Created the index
- Shut down the database
- Randomly selected 200 entries from the word list and built a file
full of (SELECT * FROM dict WHERE word = '<word>') queries using them.
- Cleared out the kernel cache
- Started the database
- Ran the query file
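
For anyone wanting to repeat this, the query-file step could be sketched
roughly as below. This is a minimal sketch, not the script actually used:
the function name build_queries, the seed parameter and the quote
doubling are assumptions added for illustration; only the table name,
the column name and the count of 200 come from the steps above.

```python
import random


def build_queries(words, count=200, seed=None):
    """Pick `count` distinct words at random and return one SELECT per word.

    Hypothetical helper: `seed` is only there so a run can be repeated.
    """
    rng = random.Random(seed)
    # sample() draws without replacement, so no word is queried twice.
    chosen = rng.sample(list(words), count)
    queries = []
    for word in chosen:
        # Double any single quote so the word stays a valid SQL string literal.
        literal = word.replace("'", "''")
        queries.append("SELECT * FROM dict WHERE word = '%s';" % literal)
    return queries
```

Writing the returned lines to a file, one per line, gives something that
can be fed to psql with -f once the database has been restarted.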

The result is an improvement of 5-10 ms in the overall execution time
of all 200 queries, or about 0.025-0.05 ms per query, which is
practically unnoticeable. As this is in the range of noise, methinks
there's a larger problem with hash indexes. I haven't looked heavily
into their implementation, but do any of you know of any major design
flaws?

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah(dot)harris(at)enterprisedb(dot)com
Edison, NJ 08837 | http://www.enterprisedb.com/
