Re: On-the-fly index tuple deletion vs. hot_standby

From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: On-the-fly index tuple deletion vs. hot_standby
Date: 2010-12-10 17:55:04
Message-ID: 20101210175504.GA19580@tornado.gateway.2wire.net
Lists: pgsql-hackers

On Thu, Dec 09, 2010 at 09:48:25AM +0000, Simon Riggs wrote:
> On Fri, 2010-12-03 at 21:43 +0200, Heikki Linnakangas wrote:
> > On 29.11.2010 08:10, Noah Misch wrote:
> > > I have a hot_standby system and use it to bear the load of various reporting
> > > queries that take 15-60 minutes each. In an effort to avoid long pauses in
> > > recovery, I set a vacuum_defer_cleanup_age constituting roughly three hours of
> > > the master's transactions. Even so, I kept seeing recovery pause for the
> > > duration of a long-running query. In each case, the culprit record was an
> > > XLOG_BTREE_DELETE arising from on-the-fly deletion of an index tuple. The
> > > attached test script demonstrates the behavior (on HEAD); the index tuple
> > > reclamation conflicts with a concurrent "SELECT pg_sleep(600)" on the standby.
> > >
> > > Since this inserting transaction aborts, HeapTupleSatisfiesVacuum reports
> > > HEAPTUPLE_DEAD independent of vacuum_defer_cleanup_age. We go ahead and remove
> > > the index tuples. On the standby, btree_xlog_delete_get_latestRemovedXid does
> > > not regard the inserting-transaction outcome, so btree_redo proceeds to conflict
> > > with snapshots having visibility over that transaction. Could we correctly
> > > improve this by teaching btree_xlog_delete_get_latestRemovedXid to ignore tuples
> > > of aborted transactions and tuples inserted and deleted within one transaction?
>
> @Noah Easily the best bug report submitted in a long time. Thanks.
>
> > Seems reasonable. HeapTupleHeaderAdvanceLatestRemovedXid() will need
> > similar treatment. Actually, btree_xlog_delete_get_latestRemovedXid()
> > could just call HeapTupleHeaderAdvanceLatestRemovedXid().
>
> Yes, it applies to other cases also. Thanks for the suggestion.
>
> Fix committed. Please double-check my work, committed early since I'm
> about to jump on a plane.

Thanks for making that change. For my own understanding: why does the
xmin == xmax special case in HeapTupleHeaderAdvanceLatestRemovedXid not require
!HEAP_UPDATED, as the corresponding case in HeapTupleSatisfiesVacuum does? I can
neither construct a recipe that triggers a problem with the code as it stands,
nor come up with a sound explanation for why no such recipe can exist.
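
To make the comparison concrete, here is a toy model of the two special cases
as described in the paragraph above. It is only a sketch, not the PostgreSQL
source: ToyTuple, TOY_HEAP_UPDATED and the toy_* functions are hypothetical
stand-ins, and the model ignores commit status, hint bits and XID wraparound.
It shows only which tuple shape the two checks treat differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical stand-in for the HEAP_UPDATED infomask bit. */
#define TOY_HEAP_UPDATED 0x2000

/* Hypothetical, heavily reduced tuple header: just the fields discussed. */
typedef struct ToyTuple
{
    TransactionId xmin;      /* inserting transaction */
    TransactionId xmax;      /* deleting/updating transaction */
    uint16_t      infomask;  /* only TOY_HEAP_UPDATED is modeled */
} ToyTuple;

/*
 * Vacuum-side special case as described: a tuple inserted and deleted by
 * the same transaction counts as dead early only if it is not itself the
 * product of an update.
 */
bool
toy_vacuum_early_dead(const ToyTuple *t)
{
    return t->xmin == t->xmax && !(t->infomask & TOY_HEAP_UPDATED);
}

/*
 * Recovery-side special case as described: the tuple is skipped when
 * advancing latestRemovedXid whenever xmin == xmax, with no flag test.
 */
bool
toy_redo_skips_tuple(const ToyTuple *t)
{
    return t->xmin == t->xmax;
}

/* True when the two checks give different answers for the same tuple. */
bool
toy_checks_disagree(const ToyTuple *t)
{
    return toy_redo_skips_tuple(t) != toy_vacuum_early_dead(t);
}

/* Walks the three tuple shapes of interest. */
void
toy_self_check(void)
{
    ToyTuple plain   = {100, 100, 0};                /* insert+delete, same xact */
    ToyTuple updated = {100, 100, TOY_HEAP_UPDATED}; /* same, but an updated version */
    ToyTuple other   = {100, 105, 0};                /* deleted by a later xact */

    assert(!toy_checks_disagree(&plain));
    assert(toy_checks_disagree(&updated));   /* the case the question is about */
    assert(!toy_checks_disagree(&other));
}
```

Calling toy_self_check() from any main() exercises the three shapes. Only the
middle one, xmin == xmax with the updated flag set, separates the two checks,
and whether that combination can ever matter during recovery is exactly the
open question.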

nm
