Re: Hot standby and b-tree killed items

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby and b-tree killed items
Date: 2008-12-30 15:17:38
Message-ID: 1230650258.4793.1320.camel@ebony.2ndQuadrant
Lists: pgsql-hackers
On Fri, 2008-12-19 at 09:22 -0500, Greg Stark wrote:

> I'm confused shouldn't read-only transactions on the slave just be  
> hacked to not set any hint bits including lp_delete?

It seems there are multiple issues involved and I saw only the first of
these initially. I want to explicitly separate these issues so we can
discuss them more easily.

1. When we replay an XLOG_BTREE_DELETE record, we may have to wait for
other sessions and then, if they do not finish, cancel them.

Possibly a pain, but these records are not very common now that we have
HOT, except on certain kinds of queue table.

2. Should we ignore the LP_DEAD flag on btree rows when we are using the
index during recovery? As Heikki points out, this hint bit is not WAL
logged, but can appear in the standby as a result of full page writes.
Those LP_DEAD flags were set against the primary's xmin, not the
standby's, so they could cause the standby to skip index rows that
should have been included in a correct MVCC answer.

So we need to either

(a) always ignore LP_DEAD flags we see when reading an index during
recovery.

(b) include an additional step to clean the full page writes to remove
LP_DEAD hints from the incoming pages.

(b) is feasible, but would need to be repeated each time a new full page
arrived, so a page may need to be re-cleaned many times. Sounds like a
bad plan, so we should choose (a).
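
To make the trade-off concrete, here is a minimal sketch of both options
using toy stand-ins. ToyIndexItem, ToyIndexPage, toy_item_can_be_skipped
and toy_clean_restored_page are all hypothetical names invented for this
illustration, not PostgreSQL's real page structures or functions. Option
(a) is a single test at read time; option (b) is a pass over the page
that would have to run again for every full page image that arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not PostgreSQL's real structures. */
typedef struct ToyIndexItem
{
    bool lp_dead;   /* hint bit, possibly set on the primary */
    int  heap_tid;  /* hypothetical reference to the heap tuple */
} ToyIndexItem;

typedef struct ToyIndexPage
{
    ToyIndexItem items[8];
    size_t       nitems;
} ToyIndexPage;

/*
 * Option (a): during recovery the hint is never trusted, because it was
 * set against the primary's xmin rather than the standby's.
 */
static bool
toy_item_can_be_skipped(const ToyIndexItem *item, bool in_recovery)
{
    assert(item != NULL);
    if (in_recovery)
        return false;       /* always visit the heap and check visibility */
    return item->lp_dead;   /* normal running: the hint can be trusted */
}

/*
 * Option (b): clear every hint on a page restored from a full page image.
 * Returns the number of hints cleared. This would need to be repeated for
 * each new image of the same page.
 */
static size_t
toy_clean_restored_page(ToyIndexPage *page)
{
    size_t cleared = 0;

    assert(page != NULL);
    for (size_t i = 0; i < page->nitems; i++)
    {
        if (page->items[i].lp_dead)
        {
            page->items[i].lp_dead = false;
            cleared++;
        }
    }
    return cleared;
}
```

With (a) the cost is an extra heap visit for items the primary had
already marked, while the page itself is left untouched.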

3. Should we set the LP_DEAD flag on btree rows when we are using the
index during recovery? Not much point if we are ignoring them.

There is no space for an additional flag to distinguish between primary
and standby hint bits.


Issues (2) and (3) would go away entirely if the standby and primary
always shared a single system-wide xmin value, i.e. if the standby and
primary were locked together at their xmins. Perhaps that was Heikki's
intention in recent suggestions?

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

