Re: Unsplitting btree index leaf pages

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unsplitting btree index leaf pages
Date: 2005-12-22 10:41:33
Message-ID: 1135248093.2964.413.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2005-12-21 at 19:07 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > While we scan, if we found two adjacent pages, both of which have less
> > than (say) 40% rows, we could re-join or "unsplit" those pages together.
>
> Curiously enough, this has been thought of before. It is not as easy
> as you think, or it would have been done the first time around.
> nbtree/README explains why:
>
> : We consider deleting an entire page from the btree only when it's become
> : completely empty of items. (Merging partly-full pages would allow better
> : space reuse, but it seems impractical to move existing data items left or
> : right to make this happen --- a scan moving in the opposite direction
> : might miss the items if so. We could do it during VACUUM FULL, though.)

Sorry, I missed that.

"Seems impractical"? Complex, yes. But I think it sounds a lot simpler
than the online REINDEX scheme... which is also the reason enhancing
VACUUM FULL doesn't appeal.

During the first VACUUM index scan, we can identify pages that are
candidates for reorganisation. Then during the second physical scan that
follows we can reorg pages in the same manner we delete them.

We identify a page during VACUUM for reorg if:
- it is < 20% full and we want to write this page
- the following page is < 20% full and we want to write this page
- it has a higher physical block number than the following page
That way we aren't super aggressive about reorg, but the cases we do
pick have a tendency to keep the index clustered. (Perhaps that could be
settable via an index optional parameter). That definition also means we
have almost no additional I/O over what the VACUUM would have written
anyway.

Page deletion requires us to lock left, lock right, lock above. That is
exactly the same if we have identified the page for reorg, since once we
have locked left and right, we just move rows to one or both of the
those other blocks, then perform the marking half-dead.

I know you've considered these things deeply, but my viewpoint is that
when what you have is already very good the only way forward consists of
considering seemingly foolish thoughts: backtracking.

Best Regards, Simon Riggs

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2005-12-22 10:53:39 Re: Recovery from multi trouble
Previous Message Peter Eisentraut 2005-12-22 10:32:58 Re: horology regression test failure