Re: [PERFORM] BUG #2737: hash indexing large table fails, while btree of same index works

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Balazs Nagy" <bnagy(at)thenewpush(dot)com>, pgsql-bugs(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] BUG #2737: hash indexing large table fails, while btree of same index works
Date: 2006-11-16 22:48:16
Message-ID: 1884.1163717296@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-performance

"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> On Fri, 2006-11-10 at 18:55 -0500, Tom Lane wrote:
>> One thought that comes to mind is to require hash to do an smgrextend()
>> addressing the last block it intends to use whenever it allocates a new
>> batch of blocks, whereupon md.c could adopt a saner API: allow
>> smgrextend but not other calls to address blocks beyond the current EOF.

> Yes, do it.
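
For concreteness, the hash-side half of that proposal would look roughly
like the following (an untested sketch, not committed code: the helper name
is invented, and the existing smgrextend(reln, blocknum, buffer, isTemp)
signature is assumed):

#include "postgres.h"
#include "storage/smgr.h"
#include "utils/rel.h"

/*
 * Explicitly extend the index out to the last block we intend to use in
 * this batch.  Under the proposed md.c rule, smgrextend() would be the only
 * call allowed to address blocks beyond the current physical EOF, so any
 * other out-of-range reference becomes a hard error instead of depending on
 * segment files that may not exist yet.
 */
static void
_hash_preextend(Relation rel, BlockNumber lastblock)
{
	char		zerobuf[BLCKSZ];

	MemSet(zerobuf, 0, sizeof(zerobuf));

	/* make sure we have an open smgr relation to work with */
	RelationOpenSmgr(rel);

	smgrextend(rel->rd_smgr, lastblock, zerobuf, rel->rd_istemp);
}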

I found out that it's easy to reproduce this failure in the regression
tests, just by building with RELSEG_SIZE set to 128K instead of 1G:

*** ./expected/create_index.out Sun Sep 10 13:44:25 2006
--- ./results/create_index.out Thu Nov 16 17:33:29 2006
***************
*** 323,328 ****
--- 323,329 ----
--
CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
CREATE INDEX hash_name_index ON hash_name_heap USING hash (random name_ops);
+ ERROR: could not open segment 7 of relation 1663/16384/26989 (target block 145): No such file or directory
CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
-- CREATE INDEX hash_ovfl_index ON hash_ovfl_heap USING hash (x int4_ops);
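
(Side note for anyone reproducing this: RELSEG_SIZE counts BLCKSZ-sized
blocks rather than bytes, so the 128K-per-segment build above corresponds to
overriding the definition by hand, roughly like this; purely an illustration
of the tweak.)

/*
 * Illustration only: 16 blocks * 8K default BLCKSZ = 128K per segment
 * file, instead of the stock 131072 blocks = 1GB.
 */
#define RELSEG_SIZE 16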

AFAICS, any hash index exceeding a single segment is at serious risk.
The fact that we've not heard gripes before suggests that no one is
using gigabyte-sized hash indexes.

But it seems mighty late in the beta cycle to be making subtle changes
in the smgr API. What I'm inclined to do for now is to hack
_hash_expandtable() to write a page of zeroes at the end of each file
segment when an increment in hashm_ovflpoint causes the logical EOF to
cross one or more segment boundaries. This is pretty ugly and nonmodular, but it
will fix the bug without risking breakage of any non-hash code.
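
Concretely, the stopgap amounts to roughly this (untested sketch: the helper
name, the choice of smgrextend() as the primitive, and the boundary
arithmetic are all assumptions, with the same 8.2-era smgrextend signature
as above):

/*
 * Called when bumping hashm_ovflpoint moves the logical EOF from
 * old_nblocks to new_nblocks.  Write a zero page at the last block of
 * every segment the new EOF has crossed, so those segment files exist at
 * full length before anything tries to open the ones after them.
 */
static void
_hash_zero_crossed_segments(Relation rel, BlockNumber old_nblocks,
							BlockNumber new_nblocks)
{
	char		zerobuf[BLCKSZ];
	BlockNumber seg_end;

	MemSet(zerobuf, 0, sizeof(zerobuf));
	RelationOpenSmgr(rel);

	/* last block of the segment holding the first not-yet-used block */
	seg_end = (old_nblocks / ((BlockNumber) RELSEG_SIZE) + 1) *
		((BlockNumber) RELSEG_SIZE) - 1;

	for (; seg_end < new_nblocks; seg_end += RELSEG_SIZE)
		smgrextend(rel->rd_smgr, seg_end, zerobuf, rel->rd_istemp);
}
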
I'll revisit the cleaner solution once 8.3 devel begins. Comments?

regards, tom lane
