From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Corner cases with GiST n-way splits |
Date: | 2012-05-10 17:14:25 |
Message-ID: | 4FABF771.9050304@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
GiST page splitting has the peculiarity that it sometimes needs to split
a single page into more than two pages. It happens rarely in practice,
but it possible (*). With a bad picksplit function, it happens more often.
While testing with a custom gist opclass with truly evil helper
functions, I found two corner cases with the current implementation when
a page is split into many halves:
1. If a page is split into more than 100 pages, you run into the same
limit of 100 simultaneous lwlocks that Tom Forbes reported with a
pathological intarray index. This time it's not because we hold locks on
many different levels, but because of a single split.
2. When the root page is split, there is no parent page to update, so we
just create a new root page with the downlinks. However, when you split
a page into a lot of siblings, it's possible that all the downlinks
don't fit on a single page. The code is prepared for that situation. You
get an error, when it tries to add more downlinks on a single page than
fit there.
I'm not sure what to do about these. Neither issue is something you'd
actually bump into in an index that's doing something useful; there's
been no user complaints about these.
(*) What follows is an explanation of how a page can be split into more
than two halves, to help you (and me!) understand this:
In a very pathological case, it's possible for a single insertion to
cause a page to be split into hundreds of pages. Imagine that you have a
page full of very small tuples (let's imagine that a page can hold 8
letters, and ignore all tuple overhead for now):
A B C D E F G H
Now you insert one large tuple on the page, DDDD. picksplit algorithm
can choose to split this as:
A - B C D E F G H DDDD
The right side is still too large to on a single page, so it's
iteratively split again:
A - B - C D E F G H DDDD
And again:
A - B - C - D E F G H DDDD
And again:
A - B - C - D - E F G H DDDD
In this example, the page was split into 5 halves, but in reality a page
can hold many more tuples, and the difference between a small and a
large tuple can be much greater, so you can end up with many more
siblings in one split.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2012-05-10 17:16:13 | Re: Draft release notes complete |
Previous Message | Bruce Momjian | 2012-05-10 17:14:00 | Re: Draft release notes complete |