Re: GiST insert algorithm rewrite

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: GiST insert algorithm rewrite
Date: 2010-12-13 12:09:06
Message-ID: 4D060CE2.1040601@enterprisedb.com
Lists: pgsql-hackers

On 03.12.2010 23:54, Heikki Linnakangas wrote:
> There's one bug remaining that I found during testing. If you crash,
> leaving an incomplete split behind, and then vacuum the table removing
> all the aborted tuples from the pages, it's possible that you end up
> with a completely empty page that has no downlink yet. The code to
> complete incomplete splits doesn't cope with that at the moment - it
> doesn't know how to construct a parent key for a child that has no tuples.
>
> The nicest way to handle that would be to recycle the empty page instead
> of trying to finish the page split, but I think there might be a race
> condition there if the page gets quickly reused while a scan is just
> about to visit it through the rightlink. GiST doesn't seem to ever reuse
> pages in normal operation, which conveniently avoids that problem.
> Simply abandoning the page forever is certainly one way to handle it, it
> shouldn't happen that often.

I fixed that by simply abandoning such pages. That seems acceptable, given
that crashes that leave an incomplete split behind should be rare. GiST
never tries to reuse empty pages anyway, so leaking a page now and then
after a crash seems like the least of our worries on that front.

I also added a check for the F_FOLLOW_RIGHT flag to the GiST scan code.
Tom Lane pointed out off-list that it was missing earlier.

I realized that the way I was setting the NSN and clearing the flag on
child pages, when the downlink is inserted into the parent, was not safe.
We need to update the child page's LSN and take a full-page image of it,
per the usual rules.

But that creates a new problem: there's a maximum of three backup blocks
per WAL record, but a GiST page can be split into any number of child
pages in one operation, so you might run out of backup block slots.
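
The arithmetic, as a toy sketch: the names are made up, and it assumes the
record needs one backup block for the parent plus one for every child page
whose flag and NSN are updated.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BACKUP_BLOCKS 3     /* the per-record limit mentioned above */

/* Assumption: one backup block for the parent, one per child page. */
static int
toy_backup_blocks_needed(int nchildren)
{
    assert(nchildren >= 1);
    return 1 + nchildren;
}

/* Can a split into 'nchildren' pages be logged as a single WAL record? */
static bool
toy_fits_in_one_record(int nchildren)
{
    return toy_backup_blocks_needed(nchildren) <= TOY_MAX_BACKUP_BLOCKS;
}
```

Under that assumption an ordinary two-way split fits, but a split into
three or more pages does not.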

Attached is an updated patch, but the issue with the limited number of
backup blocks still needs to be resolved. The straightforward way would
be to change the WAL format to increase the limit. Another option is to
refactor the GiST insertion code some more, to insert the downlink
pointers into the parent one by one, instead of as one big operation,
when a page is split into more than two pages.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
gist-insert-rewrite-4.patch text/x-diff 113.3 KB
