Re: On-the-fly index tuple deletion vs. hot_standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: On-the-fly index tuple deletion vs. hot_standby
Date: 2011-06-12 04:15:29
Message-ID: BANLkTinfxPkhhStSPVQ-+O3J1_cYzGBgcA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 11, 2011 at 11:40 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> We currently achieve that wait-free by first marking the page with the next
> available xid and then reusing it when that mark (btpo.xact) predates the
> oldest running xid (RecentXmin).  (At the moment, I'm failing to work out why
> this is OK with scans from transactions that haven't allocated an xid, but I
> vaguely recall convincing myself it was fine at one point.)  It would indeed
> also be enough to call GetLockConflicts(locktag-of-index, AccessExclusiveLock)
> and check whether any of the returned transactions have PGPROC.xmin below the
> mark.  That's notably more expensive than just comparing RecentXmin, so I'm
> not sure how well it would pay off overall.  However, it could only help us on
> the master.  (Not strictly true, but any way I see to extend it to the standby
> has critical flaws.)  On the master, we can see a conflicting transaction and
> put off reusing the page.  By the time the record hits the standby, we have to
> apply it, and we might have a running transaction that will hold a lock on the
> index for the next, say, 72 hours.  At such times, vacuum_defer_cleanup_age or
> hot_standby_feedback ought to prevent the recovery stall.
>
> This did lead me to realize that what we do in this regard on the standby can
> be considerably independent from what we do on the master.  If fruitful, the
> standby can prove the absence of a scan holding a right-link in a completely
> different fashion.  So, we *could* take the cleanup-lock approach on the
> standby without changing very much on the master.

Well, I'm generally in favor of trying to fix this problem without
changing what the master does. It's a weakness of our replication
technology that the standby has no better way to cope with a cleanup
operation on the master than to start killing queries, but then again
it's a weakness of our MVCC technology that we don't reuse space
quickly enough and end up with bloat. I hear a lot more complaints
about the second weakness than I do about the first.

At any rate, if taking a cleanup lock on the right-linked page on the
standby is sufficient to fix the problem, that seems like a far
superior solution in any case. Presumably someone will hold a pin on
that particular page far less often than a conflict would be detected
by any matching based on XID or heavyweight locks. And the vast majority of
such pins should disappear before the startup process feels obliged to
get out its big hammer.
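
To make the contrast concrete, here is a toy sketch in C of the two kinds
of test being compared. It is not PostgreSQL source: ToyXid, ToyPage and
the toy_* functions are names made up for illustration, and the page model
is reduced to the deletion mark (standing in for btpo.xact) and a pin
count. The XID-based test asks whether the mark predates the oldest
running xid; the pin-based test asks whether a cleanup lock could be had,
which I am modelling simply as "ours is the only pin on the buffer".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a 32-bit transaction id (hypothetical type). */
typedef uint32_t ToyXid;

/* Toy model of a deleted index page sitting in a shared buffer. */
typedef struct ToyPage
{
    ToyXid deleted_at;  /* mark stamped at deletion, in the role of btpo.xact */
    int    pin_count;   /* how many backends currently pin the buffer */
} ToyPage;

/*
 * Wraparound-aware "a is older than b", using modulo-2^32 arithmetic.
 * Assumes the usual two's-complement conversion to int32_t.
 */
static bool
toy_xid_precedes(ToyXid a, ToyXid b)
{
    return (int32_t) (a - b) < 0;
}

/* XID-based test: reuse only once the mark predates the oldest running xid. */
static bool
toy_reusable_by_xid(const ToyPage *page, ToyXid oldest_running_xid)
{
    return toy_xid_precedes(page->deleted_at, oldest_running_xid);
}

/* Pin-based test: a cleanup lock is obtainable when ours is the only pin. */
static bool
toy_cleanup_lock_available(const ToyPage *page)
{
    assert(page->pin_count >= 1);   /* caller must already hold its own pin */
    return page->pin_count == 1;
}
```

With deleted_at = 100 and an oldest running xid of 90, the XID test keeps
refusing for as long as that old transaction lives, whether or not it ever
touches this index. The pin test refuses only while some other backend
actually holds a pin on this one page.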

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
