Index Page Split logging

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Index Page Split logging
Date: 2008-01-01 13:06:19
Message-ID: 1199192779.9558.256.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

When we split an index page we perform a multi-block operation that is
both fairly expensive and complex to reconstruct should we crash partway
through.

If we could log *only* the insert that caused the split, rather than the
split itself, we would avoid that situation entirely. This would then
mean that the recovery code would resolve the split by performing a full
logical split rather than replaying pieces of the original physical
split. Doing that would remove a ton of complexity, as well as reducing
log volumes.

We would need to ensure that the right-hand page of the split reached
disk before the left-hand page. If a crash occurs when only the right
hand page has reached disk then there would be no link (on disk) to it
and so it would be ignored. We would need an orphaned page detection
mechanism to allow the page to be VACUUMed sometime in the future. There
would also be some sort of ordering required in the buffer manager, so
that pages which must be written last are kept pinned until the first
page is written. That sounds like it is fairly straightforward and it
would allow a generic mechanism that worked for all index splits, rather
than requiring individual code for each rmgr.

ISTM that would require Direct I/O to perform physical writes in a
specific order, rather than just issue the writes and fsync. Which
probably kills it for now, even assuming you followed me on every point
up till now...

So I'm mentioning this really to get the idea out there and see if
anybody has any bright ideas, rather than as a well-formed proposal for
immediate implementation.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2008-01-01 15:23:29 8.3RC1 release date
Previous Message kenneth d'souza 2008-01-01 13:02:42 concurrency in psql