Re: [PERFORM] BUG #2737: hash indexing large tablefails, while btree of same index works

From: Kenneth Marshall <ktm(at)it(dot)is(dot)rice(dot)edu>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Balazs Nagy <bnagy(at)thenewpush(dot)com>, pgsql-bugs(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] BUG #2737: hash indexing large tablefails, while btree of same index works
Date: 2006-11-17 14:26:57
Message-ID: 20061117142657.GQ12572@it.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-performance

On Fri, Nov 17, 2006 at 10:59:11AM +0000, Simon Riggs wrote:
> On Thu, 2006-11-16 at 17:48 -0500, Tom Lane wrote:
>
> > AFAICS, any hash index exceeding a single segment is at serious risk.
> > The fact that we've not heard gripes before suggests that no one is
> > using gigabyte-sized hash indexes.
>
> Seems so.
>
> > But it seems mighty late in the beta cycle to be making subtle changes
> > in the smgr API. What I'm inclined to do for now is to hack
> > _hash_expandtable() to write a page of zeroes at the end of each file
> > segment when an increment in hashm_ovflpoint causes the logical EOF to
> > cross segment boundary(s). This is pretty ugly and nonmodular, but it
> > will fix the bug without risking breakage of any non-hash code.
> > I'll revisit the cleaner solution once 8.3 devel begins. Comments?
>
> Do we think there is hope of improving hash indexes? If not, I'm
> inclined to remove them rather than to spend time bolstering them. We
> can remap the keyword as was done with RTREE. It's somewhat embarrassing
> having an index without clear benefit that can't cope across crashes. We
> wouldn't accept that in other parts of the software...
>

While I understand that there are currently serious problems in terms of
recovery with the hash index as it stands currently, it is theoretically
possible to get a result back with a single I/O probe from a hash and
that is not the case with btree indexes. At the top end of the performance
curve, every little bit that can be done to minimize actual I/Os is needed.
I certainly hold out some hope that they can improved. I would like to see
them still included. Once they are gone, it will be much harder to ever
add them back.

Ken Marshall

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Simon Riggs 2006-11-17 14:38:12 Re: [PERFORM] BUG #2737: hash indexing largetablefails, while btree of same index works
Previous Message Simon Riggs 2006-11-17 10:59:11 Re: [PERFORM] BUG #2737: hash indexing large tablefails, while btree of same index works

Browse pgsql-performance by date

  From Date Subject
Next Message Simon Riggs 2006-11-17 14:38:12 Re: [PERFORM] BUG #2737: hash indexing largetablefails, while btree of same index works
Previous Message Richard Huxton 2006-11-17 11:20:52 Re: Keeping processes open for re-use