From: | Kenneth Marshall <ktm(at)rice(dot)edu> |
---|---|
To: | "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com> |
Cc: | Xiao Meng <mx(dot)cogito(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PATCH]-hash index improving |
Date: | 2008-07-17 21:01:36 |
Message-ID: | 20080717210135.GN337@it.is.rice.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jul 17, 2008 at 04:24:28PM -0400, Jonah H. Harris wrote:
> On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng <mx(dot)cogito(at)gmail(dot)com> wrote:
> > The patch store hash code only in the index tuple.
> > It based on Neil Conway's patch with an old version of PostgreSQL.
> > It passes the regression test but I didn't test the performance yet.
> > Anyone interested can make a performance test;-)
> > You can undefine the macro HASHVALUE_ONLY in hash.h to get the
> > original implementation.
> > It's a preliminary implementation and I'm looking for input here.
> > Hope to hear from you.
>
> I've spent some time today performing tests similar to those mentioned
> here (http://archives.postgresql.org/pgsql-hackers/2007-09/msg00208.php)
>
> Using a word list of 2650024 unique words (maximum length is 118
> bytes), build times are still high, but I'm not really seeing any
> performance improvements over b-tree. I haven't profiled it yet, but
> my test is as follows:
>
> - Created the dict table
> - Loaded the dict table
> - Counted the records in the dict table
> - Created the index
> - Shutdown the database
> - Randomly selected 200 entries from the word list and built a file
> full of (SELECT * FROM dict WHERE word = '<word>') queries using them.
> - Cleared out the kernel cache
> - Started the database
> - Ran the query file
>
> The result of this is between 5-10ms improvement in the overall
> execution time of all 200 queries. The time-per-query is practically
> unnoticeable. As this is in the range of noise, methinks there's a
> larger problem with hash indexes. I haven't looked heavily into their
> implementation, but do you any of you know of any major design flaws?
>
Jonah,
Thank you for running these tests. I was trying to reproduce my initial
tests on the original system to make it more apples to apples, but the
latest release needs more resources semaphore-wise than the 8.2 and
to fix it on Solaris 8 I will need a reboot. Would you mind posting
the timings for the hash_only index versus the hash_value versus the
btree index for the same test. Also, what is the on-disk size of all
three indexes? This will allow us to figure out the bucket/page load
or fill-factor for each scenario.
The basic implementation looked reasonable. I will take a look at
the patch this evening.
Regards,
Ken
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2008-07-17 21:03:48 | Re: autovacuum crash due to null pointer |
Previous Message | Jonah H. Harris | 2008-07-17 20:24:28 | Re: [PATCH]-hash index improving |