Re: [PERFORM] BUG #2737: hash indexing large tablefails, while btree of same index works

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Balazs Nagy" <bnagy(at)thenewpush(dot)com>, <pgsql-bugs(at)postgresql(dot)org>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [PERFORM] BUG #2737: hash indexing large tablefails, while btree of same index works
Date: 2006-11-17 10:59:11
Message-ID: 1163761151.27956.592.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-performance

On Thu, 2006-11-16 at 17:48 -0500, Tom Lane wrote:

> AFAICS, any hash index exceeding a single segment is at serious risk.
> The fact that we've not heard gripes before suggests that no one is
> using gigabyte-sized hash indexes.

Seems so.

> But it seems mighty late in the beta cycle to be making subtle changes
> in the smgr API. What I'm inclined to do for now is to hack
> _hash_expandtable() to write a page of zeroes at the end of each file
> segment when an increment in hashm_ovflpoint causes the logical EOF to
> cross segment boundary(s). This is pretty ugly and nonmodular, but it
> will fix the bug without risking breakage of any non-hash code.
> I'll revisit the cleaner solution once 8.3 devel begins. Comments?

Do we think there is hope of improving hash indexes? If not, I'm
inclined to remove them rather than to spend time bolstering them. We
can remap the keyword as was done with RTREE. It's somewhat embarrassing
having an index without clear benefit that can't cope across crashes. We
wouldn't accept that in other parts of the software...

If there is hope, is there a specific place to look? It would be good to
brain dump some starting places for an investigation.

Does anybody have a perf test that shows hash indexes beating btrees by
any significant margin? (Not saying there isn't one...)

I can see the argument that fixed hash indexes would be faster, but
there are obviously major downsides to that approach.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kenneth Marshall 2006-11-17 14:26:57 Re: [PERFORM] BUG #2737: hash indexing large tablefails, while btree of same index works
Previous Message Freddy Melo Acosta 2006-11-16 23:07:03 BUG #2761: Using NHibernate witn VS .NET 2.O

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2006-11-17 11:20:52 Re: Keeping processes open for re-use
Previous Message Brian Wipf 2006-11-17 00:03:21 shared_buffers > 284263 on OS X