Unsplitting btree index leaf pages

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Unsplitting btree index leaf pages
Date: 2005-12-22 00:00:27
Message-ID: 1135209627.2964.351.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

When we discussed online REINDEX recently we focused on the REINDEX
command itself rather than look at alternative approaches.

One reason to REINDEX is because of index page splits getting things out
of sequence and generally bloating the index.

When we VACUUM, each index is scanned in logical order.

While we scan, if we found two adjacent pages, both of which have less
than (say) 40% rows, we could re-join or "unsplit" those pages together.
The index blocks are fully locked during the read anyway and there is no
MVCC problem with moving index rows between blocks. All we have to do is
to lock both blocks, having locked them in the correct order.

The rows would always be moved to the lowest physical block id, so that
data would naturally migrate towards the start of the index file. Blocks
would then be marked half-dead just as if they had just had their last
index row removed by the vacuum.

We could start the scan by locking block 1 AND block2, then scan forward
always holding 2 locks as we go. That way we would not need to unlock
and relock the blocks to lock two blocks. The concurrency loss would not
be that great, but we would gain the ability to unsplit the two blocks
into one.

If we do this, we would could possibly avoid the need to full REINDEX
entirely.

If this method checks out we could do one of:
- make VACUUM do this always
- add an option for REINDEX: CLEAN/COMPRESS/VACUUM etc to do this upon
command only

Best Regards, Simon Riggs

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-12-22 00:07:23 Re: Unsplitting btree index leaf pages
Previous Message Tom Lane 2005-12-21 23:44:50 Re: horology regression test failure