From: | Kenneth Marshall <ktm(at)rice(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Hash index todo list item |
Date: | 2007-09-10 02:42:58 |
Message-ID: | 20070910024258.GC5679@it.is.rice.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote:
> Kenneth Marshall <ktm(at)rice(dot)edu> writes:
> > ... This is the rough plan. Does anyone see anything critical that
> > is missing at this point?
>
> Sounds pretty good. Let me brain-dump one item on you: one thing that
> hash currently has over btree is the ability to handle index items up
> to a full page. Now, if you go with a scheme that only stores hash
> codes and not the underlying data, you can not only handle that but
> improve on it; but if you reduce the bucket size and don't remove the
> data, it'd be a step backward. The idea I had about dealing with that
> was to only reduce the size of primary buckets --- if it's necessary to
> add overflow space to a bucket, the overflow units are still full pages.
> So an index tuple up to a page in size can always be accommodated by
> adding an overflow page to the bucket.
>
> Just a thought, but AFAIR it's not in the archives anywhere.
>
> regards, tom lane
>
I was thinking about this some more, and it strikes me that we can
keep the page size = bucket size = overflow size in the new scheme
of storing the hash value in the index and still reduce the effective
bucket size. Let's make the new size the (page size / 2^n) where n
is chosen appropriately. Then we we store the value in the page we
simply use n bits of the hash to determine which sub-piece of the
page to use to actually store the value. We may need to add a multiplier
to adjust the decision to split the page based on the mini-page. This
should allow us to much more densely pack the pages and approach the
1 item per bucket. This could easily shrink the size of the index by
a factor of 2^n. Let me know what you think.
Regards,
Ken
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-09-10 03:22:26 | Re: invalidly encoded strings |
Previous Message | Tom Lane | 2007-09-10 01:52:33 | Re: Are we done with sync-commit-defaults-to-off patch? |