Re: Re: [COMMITTERS] pgsql: Augment WAL records for btree delete with GetOldestXmin() to

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Augment WAL records for btree delete with GetOldestXmin() to
Date: 2010-03-27 10:10:42
Message-ID: 1269684642.3684.1956.camel@ebony
Lists: pgsql-committers pgsql-hackers

On Fri, 2010-03-26 at 16:16 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On Sun, 2010-01-31 at 23:43 +0200, Heikki Linnakangas wrote:
> >> When replaying the deletion record, the standby could look at all the
> >> heap tuples whose index pointers are being removed, to see which one
> >> was newest.
>
> > Long after coding this, I now realise this really is a dumb-ass idea.
>
> > There is no spoon. The index tuples did once point at valid heap tuples.
> > 1. heap tuples are deleted
> > 2. heap tuples become dead
> > 3. index tuples can now be marked killed
> > 4. index tuples removed
> > Heap tuples can be removed at step 2, index tuples can't be removed
> > until step 4.
>
> Uh, no, heap tuples can't be removed until after all index entries that
> are pointing at them have been removed. Please tell me you have not
> broken this.

Nothing broken.

It appears that in practice many of the index items point to heap items
that are LP_DEAD. So for the purposes of accessing a heap tuple's xmin,
we're both right: for the current purpose the tuple has been removed,
though as you say, its stub (the line pointer) remains.

So how do I calculate xmin and xmax for an LP_DEAD tuple? Clearly
nothing can be done directly. Is there another way?

A conjecture: if the index items point to a heap tuple that is LP_NORMAL
then we can get the xmin/xmax from there. The xmin/xmax of LP_DEAD items
will always be *earlier* than the latest LP_NORMAL tuple that is being
removed. So as long as I have at least 1 LP_NORMAL heap tuple, then I
can use the latestRemovedXid from that and simply discard the LP_DEAD
items (for the purposes of this calculation). The idea is that whatever
marked those heap tuples LP_DEAD would also have marked the others, if
they were the same or earlier than the LP_DEAD ones.
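
To make the conjecture concrete, here is a minimal sketch of the calculation
in Python. It is not the attached patch. The names HeapItem and
latest_removed_xid are hypothetical, XIDs are treated as plain integers with
no wraparound handling, and returning None when every item is LP_DEAD is an
assumption, since that case is left open above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Line pointer states, using the same numeric values as PostgreSQL's itemid.h.
LP_NORMAL = 1
LP_DEAD = 3

INVALID_XID = 0  # stands in for InvalidTransactionId


@dataclass
class HeapItem:
    """Hypothetical view of one heap item an index tuple points at."""
    lp_flags: int
    xmin: int = INVALID_XID  # inserting transaction, if the tuple is readable
    xmax: int = INVALID_XID  # deleting transaction, if any


def latest_removed_xid(items: Iterable[HeapItem]) -> Optional[int]:
    """Derive latestRemovedXid from the LP_NORMAL items only.

    LP_DEAD items carry no tuple header, so they are skipped; the
    conjecture is that their xmin/xmax would be earlier than the
    latest value found among the LP_NORMAL items anyway.
    Assumption: XIDs compare as plain integers (no wraparound).
    """
    latest = None
    for item in items:
        if item.lp_flags != LP_NORMAL:
            continue  # LP_DEAD stub: nothing to read, discard it
        for xid in (item.xmin, item.xmax):
            if xid != INVALID_XID and (latest is None or xid > latest):
                latest = xid
    return latest  # None means no LP_NORMAL item was available


if __name__ == "__main__":
    items = [
        HeapItem(LP_DEAD),
        HeapItem(LP_NORMAL, xmin=100, xmax=120),
        HeapItem(LP_NORMAL, xmin=90, xmax=105),
    ]
    print(latest_removed_xid(items))
```

The sketch only answers the question when at least one LP_NORMAL item is
present; if all items are LP_DEAD, some other fallback would be needed.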

Do you agree with this conjecture? If you do, then the attached patch is
complete.

--
Simon Riggs www.2ndQuadrant.com

Attachment Content-Type Size
derive_latestRemovedXid_from_heap.patch text/x-patch 18.3 KB
