Re: pgsql: Fix a couple of bugs in MultiXactId freezing

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgsql: Fix a couple of bugs in MultiXactId freezing
Date: 2013-12-03 14:16:18
Message-ID: 20131203141618.GC1163520@tornado.leadboat.com
Lists: pgsql-committers pgsql-hackers

On Tue, Dec 03, 2013 at 11:56:07AM +0100, Andres Freund wrote:
> On 2013-12-03 00:47:07 -0500, Noah Misch wrote:
> > On Sat, Nov 30, 2013 at 01:06:09AM +0000, Alvaro Herrera wrote:
> > > Fix a couple of bugs in MultiXactId freezing
> > >
> > > Both heap_freeze_tuple() and heap_tuple_needs_freeze() neglected to look
> > > into a multixact to check the members against cutoff_xid.
> >
> > > ! /*
> > > ! * This is a multixact which is not marked LOCK_ONLY, but which
> > > ! * is newer than the cutoff_multi. If the update_xid is below the
> > > ! * cutoff_xid point, then we can just freeze the Xmax in the
> > > ! * tuple, removing it altogether. This seems simple, but there
> > > ! * are several underlying assumptions:
> > > ! *
> > > ! * 1. A tuple marked by a multixact containing a very old
> > > ! * committed update Xid would have been pruned away by vacuum; we
> > > ! * wouldn't be freezing this tuple at all.
> > > ! *
> > > ! * 2. There cannot possibly be any live locking members remaining
> > > ! * in the multixact. This is because if they were alive, the
> > > ! * update's Xid would have been considered, via the lockers'
> > > ! * snapshots' Xmin, as part of the cutoff_xid.
> >
> > READ COMMITTED transactions can reset MyPgXact->xmin between commands,
> > defeating that assumption; see SnapshotResetXmin(). I have attached an
> > isolationtester spec demonstrating the problem.
>
> Any idea how to cheat our way out of that one given the current way
> heap_freeze_tuple() works (running on both primary and standby)? My only
> idea was to MultiXactIdWait() if !InRecovery, but that's extremely grotty.
> We can't even realistically create a new multixact with fewer members
> with the current format of xl_heap_freeze.

Perhaps set HEAP_XMAX_LOCK_ONLY on the tuple? We'd then ensure all update XID
consumers check HEAP_XMAX_IS_LOCKED_ONLY() first, much like xmax consumers are
already expected to check HEAP_XMAX_INVALID first. Seems doable, albeit yet
another injection of complexity.
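
To make the intended ordering concrete, here is roughly the consumer-side
discipline I am picturing (illustration only; tuple_update_xid is an invented
name, the infomask macros are the existing ones from htup_details.h):

/* Illustration of the proposed check order; not a patch. */
static TransactionId
tuple_update_xid(HeapTupleHeader tuple)
{
	/* Existing rule: an invalid xmax must be ignored before anything else. */
	if (tuple->t_infomask & HEAP_XMAX_INVALID)
		return InvalidTransactionId;

	/*
	 * Proposed rule: if freezing marked the tuple HEAP_XMAX_LOCK_ONLY, its
	 * xmax carries no update, so bail out before interpreting it as one.
	 */
	if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
		return InvalidTransactionId;

	if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
		return HeapTupleGetUpdateXid(tuple);	/* updater member of the multi */

	return HeapTupleHeaderGetRawXmax(tuple);
}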

> > The test spec additionally
> > covers a (probably-related) assertion failure, new in 9.3.2.
>
> Too bad it's too late to do anything about it for 9.3.2 :(. At least the
> last seems actually unrelated; I am not sure why it's 9.3.2
> only. Alvaro, are you looking?

(For clarity, the other problem demonstrated by the test spec is also a 9.3.2
regression.)

> > That was the only concrete runtime problem I found during a study of the
> > newest heap_freeze_tuple() and heap_tuple_needs_freeze() code.
>
> I'd even be interested in fuzzy problems ;). If 9.3 hadn't been
> released, the interactions between cutoff_xid/multi would have caused me
> to say "back to the drawing board"... I'm not surprised if further things
> are lurking there.

heap_freeze_tuple() of 9.2 had an XXX comment about the possibility of getting
spurious lock contention due to wraparound of the multixact space. The
comment is gone, and that mechanism no longer poses a threat. However, a
non-wrapped multixact containing wrapped locker XIDs (we don't freeze locker
XIDs, just updater XIDs) may cause similar spurious contention.
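
To spell out the aliasing behind that concern, a standalone toy model (not
backend code; xid_precedes mimics the modulo-2^32 comparison that
TransactionIdPrecedes performs):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

/* Simplified model of circular XID comparison (cf. TransactionIdPrecedes). */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
	int32_t		diff = (int32_t) (id1 - id2);

	return diff < 0;
}

int
main(void)
{
	TransactionId stale_locker = 3000000000u;	/* member XID recorded long ago */
	TransactionId current_xid = 3000000000u + 4294967295u + 1u;	/* one full XID epoch later */

	/*
	 * With only 32 bits stored in the member array, the stale locker XID is
	 * numerically identical to a brand-new XID, and it no longer even looks
	 * old, so a waiter trusting the members could block on an unrelated
	 * live transaction.
	 */
	printf("aliased: %s\n", stale_locker == current_xid ? "yes" : "no");
	printf("stale looks old: %s\n", xid_precedes(stale_locker, current_xid) ? "yes" : "no");
	return 0;
}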

> + /*
> + * The multixact has an update hidden within. Get rid of it.
> + *
> + * If the update_xid is below the cutoff_xid, it necessarily
> + * must be an aborted transaction. In a primary server, such
> + * an Xmax would have gotten marked invalid by
> + * HeapTupleSatisfiesVacuum, but in a replica that is not
> + * called before we are, so deal with it in the same way.
> + *
> + * If not below the cutoff_xid, then the tuple would have been
> + * pruned by vacuum, if the update committed long enough ago,
> + * and we wouldn't be freezing it; so it's either recently
> + * committed, or in-progress. Deal with this by setting the
> + * Xmax to the update Xid directly and remove the IS_MULTI
> + * bit. (We know there cannot be running lockers in this
> + * multi, because it's below the cutoff_multi value.)
> + */
> +
> + if (TransactionIdPrecedes(update_xid, cutoff_xid))
> + {
> + Assert(InRecovery || TransactionIdDidAbort(update_xid));
> + freeze_xmax = true;
> + }
> + else
> + {
> + Assert(InRecovery || !TransactionIdIsInProgress(update_xid));

This assertion is at odds with the comment, but the assertion is okay for now.
If the updater is still in progress, its OldestMemberMXactId[] entry will have
held back cutoff_multi, and we won't be here. Therefore, if we get here, the
tuple will always be HEAPTUPLE_RECENTLY_DEAD (recently-committed updater) or
HEAPTUPLE_LIVE (aborted updater, recent or not).
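
For reference, here is roughly why that holds, as I understand the cutoff
computation (a simplified sketch of GetOldestMultiXactId(); details elided
and possibly off):

/*
 * Sketch: vacuum's cutoff_multi cannot advance past any multi that a
 * running backend is still a member of, because that backend's
 * OldestMemberMXactId[] slot participates in the minimum.  (The real
 * function also considers OldestVisibleMXactId[].)
 */
MultiXactId
oldest_needed_multi_sketch(void)
{
	MultiXactId result = ReadNextMultiXactId();
	int			i;

	for (i = 1; i <= MaxOldestSlot; i++)	/* one slot per backend */
	{
		MultiXactId thisoldest = OldestMemberMXactId[i];

		if (MultiXactIdIsValid(thisoldest) &&
			MultiXactIdPrecedes(thisoldest, result))
			result = thisoldest;
	}
	return result;
}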

Numerous comments in the vicinity (e.g. ones at MultiXactStateData) reflect a
pre-9.3 world. Most or all of that isn't new with the patch at hand, but it
does complicate study.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
