Re: B-tree parent pointer and checkpoints

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: B-tree parent pointer and checkpoints
Date: 2011-10-11 19:57:56
Message-ID: 201110111957.p9BJvv016941@momjian.us
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> On 11.03.2011 19:41, Tom Lane wrote:
> > Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> >> On 11.03.2011 17:59, Tom Lane wrote:
> >>> But that will be fixed during WAL replay.
> >
> >> Not under the circumstances that started the original thread:
> >
> >> 1. Backend splits a page
> >> 2. Checkpoint starts
> >> 3. Checkpoint runs to completion
> >> 4. Crash
> >> (5. Backend never got to insert the parent pointer)
> >
> >> WAL replay starts at the checkpoint redo pointer, which is after the
> >> page split record, so WAL replay won't insert the parent pointer. That's
> >> an incredibly tight window to hit in practice, but it's possible in theory.
> >
> > Hmm. It's not so improbable that checkpoint would start inside that
> > window, but that the parent insertion is still pending by the time the
> > checkpoint finishes is pretty improbable.
> >
> > How about just reducing the deletion-time ERROR for missing downlink to a LOG?
>
> Well, the code that follows expects to have a valid parent page locked,
> so you can't literally do just that. But yeah, LOG and aborting the page
> deletion seems fine to me.
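
For anyone reading this later, the window described above can be sketched
as a toy model. This is only an illustration under stated assumptions, not
the nbtree code: WalRecord, recover() and try_delete_page() are hypothetical
names, LSNs are plain integers, and "finishing" an incomplete split is
reduced to adding the missing downlink to a set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical, simplified WAL record (not PostgreSQL's format)."""
    lsn: int
    kind: str   # "split" or "insert_downlink"
    page: int   # the new right-hand page created by the split


def recover(on_disk_downlinks, wal, redo_lsn):
    """Replay records at or after redo_lsn; return the set of pages
    that have a downlink in their parent once recovery is done."""
    downlinks = set(on_disk_downlinks)
    incomplete = set()
    for rec in sorted(wal, key=lambda r: r.lsn):
        if rec.lsn < redo_lsn:
            continue  # before the checkpoint redo pointer: never replayed
        if rec.kind == "split":
            incomplete.add(rec.page)
        elif rec.kind == "insert_downlink":
            downlinks.add(rec.page)
            incomplete.discard(rec.page)
    # Splits seen during replay without a matching parent insertion
    # are completed at the end of recovery.
    downlinks |= incomplete
    return downlinks


def try_delete_page(page, downlinks, log):
    """Proposed behaviour: a missing downlink is logged and the deletion
    is abandoned, instead of raising an error."""
    if page not in downlinks:
        log.append(f"LOG: no downlink for page {page}; skipping deletion")
        return False
    downlinks.discard(page)
    return True


# Steps 1-5 from the thread: split at LSN 100, parent insertion never logged.
wal = [WalRecord(lsn=100, kind="split", page=7)]

# Crash before any checkpoint completes: replay sees the split and fixes it.
fixed = recover(set(), wal, redo_lsn=50)

# Checkpoint ran to completion after the split: redo pointer is past it.
broken = recover(set(), wal, redo_lsn=200)

messages = []
deleted = try_delete_page(7, broken, messages)
```

In this model `fixed` contains page 7 and `broken` does not, which is the
gap Heikki points out; the later deletion attempt then logs and gives up
rather than failing outright.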

Added to TODO:

Fix problem with btree page splits during checkpoints

http://archives.postgresql.org/pgsql-hackers/2010-11/msg00052.php
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00184.php

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
