I've worked up two alternative patches for the btree deletion bug.
The first one doesn't try to do anything about the underlying problem of
index keys becoming out-of-order; it just hacks up _bt_pagedel() to be
able to recover from the "failed to re-find parent key" condition.
The second patch instead attacks the underlying problem, by guaranteeing
that childless "half-dead" parent pages will be deleted before we allow
any further insertions into the transferred key space. This requires
two nontrivial changes: (1) a precheck step to refuse deletion of a page
if it would leave a childless but not immediately deletable parent at
any upper tree level; (2) additions to the WAL recovery code to complete
an incomplete series of deletions.
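To make the precheck in (1) concrete, here is a minimal sketch of the idea as a toy in-memory tree. It is not the patch's code: the Page class, the deletable_when_empty flag, and deletion_is_safe() are all hypothetical names, and the flag simply stands in for whatever rule decides that a childless page can be removed right away.

```python
class Page:
    """Toy stand-in for an index page; not PostgreSQL's page layout."""

    def __init__(self, name, deletable_when_empty=True):
        self.name = name
        self.parent = None
        self.children = []
        # Hypothetical flag: could this page be removed immediately
        # if it were left with no children?
        self.deletable_when_empty = deletable_when_empty

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child


def deletion_is_safe(target):
    """Return False if removing `target` would leave a childless parent
    that cannot be deleted immediately, at any upper tree level."""
    page = target
    while page.parent is not None:
        parent = page.parent
        if len(parent.children) > 1:
            # The parent keeps at least one other child, so nothing
            # above it is affected.
            return True
        if not parent.deletable_when_empty:
            # The parent would become childless and would have to stay.
            return False
        # The parent would be deleted too; repeat the check one level up.
        page = parent
    return True


# Example tree: root -> A (one leaf), B (one leaf, must stay), C (two leaves).
root = Page("root", deletable_when_empty=False)
a = root.add_child(Page("A"))
b = root.add_child(Page("B", deletable_when_empty=False))
c = root.add_child(Page("C"))
a1 = a.add_child(Page("a1"))
b1 = b.add_child(Page("b1"))
c1 = c.add_child(Page("c1"))
c2 = c.add_child(Page("c2"))
```

In this toy tree, deleting a1 is allowed (A goes away with it and the root keeps other children), deleting b1 is refused (B would be left childless but not removable), and deleting c1 is allowed because C keeps c2.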
While I think the second patch is logically cleaner, it's certainly a
lot bigger and riskier. And as far as we know, the keys-out-of-order
condition does not have any other consequences that would justify this
much work to prevent it.
I am thinking of applying the bigger patch to HEAD (8.2) and using the
smaller patch for 7.4-8.1 branches. The bigger patch adds a new WAL
record type, so if we applied it to the back branches we'd be creating
an incompatibility: e.g., 8.1.6 WAL wouldn't load into an 8.1.5 postmaster.
I'm disinclined to do that when we don't know that it's fixing any real
bug. But one could argue that we should use the smaller patch for 8.2
as well, and hold the bigger patch for 8.3 ... or even not use it at all
in the absence of any demonstrated bug.
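The compatibility concern can be illustrated with a toy replay loop. This is only a sketch under stated assumptions: the record type names below are made up for illustration and are not PostgreSQL's actual WAL record types, and replay() is a hypothetical function, not the real recovery code.

```python
# Toy model only: these record type names are hypothetical.
OLD_RELEASE_TYPES = {"insert", "split", "delete_page"}
NEW_RELEASE_TYPES = OLD_RELEASE_TYPES | {"delete_page_half"}


class UnknownRecordType(Exception):
    """Raised when replay meets a record type this release does not know."""


def replay(records, known_types):
    """Apply records in order, failing on the first unknown type."""
    applied = []
    for rec_type in records:
        if rec_type not in known_types:
            raise UnknownRecordType(rec_type)
        applied.append(rec_type)
    return applied


# A log written by the newer release, containing the new record type.
wal_from_new_release = ["insert", "delete_page_half", "insert"]
```

Replaying wal_from_new_release with NEW_RELEASE_TYPES succeeds, while replaying it with OLD_RELEASE_TYPES raises UnknownRecordType at the second record. That is the kind of failure a minor release within one branch should avoid.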
BTW, here's a reproducer for the problem, on machines with MAXALIGN 8.
Changing the constants a little would probably make it fail on MAXALIGN 4
too, but I haven't bothered trying.

    create table foo(f1 int, f2 text);
    insert into foo select x, repeat('xyzzy',100) from generate_series(1,10000) x;
    create index fooi on foo(f1,f2);
    delete from foo where f1 between 3000 and 3150;
    insert into foo select 3010, repeat('xyzzy',100) from generate_series(1,2000) x;
    delete from foo where f1 < 3000;

regards, tom lane