Re: pgsql: Fix a couple of bugs in MultiXactId freezing

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgsql: Fix a couple of bugs in MultiXactId freezing
Date: 2013-12-03 10:56:07
Message-ID: 20131203105607.GB8924@awork2.anarazel.de
Lists: pgsql-committers pgsql-hackers

On 2013-12-03 00:47:07 -0500, Noah Misch wrote:
> On Sat, Nov 30, 2013 at 01:06:09AM +0000, Alvaro Herrera wrote:
> > Fix a couple of bugs in MultiXactId freezing
> >
> > Both heap_freeze_tuple() and heap_tuple_needs_freeze() neglected to look
> > into a multixact to check the members against cutoff_xid.
>
> > ! /*
> > ! * This is a multixact which is not marked LOCK_ONLY, but which
> > ! * is newer than the cutoff_multi. If the update_xid is below the
> > ! * cutoff_xid point, then we can just freeze the Xmax in the
> > ! * tuple, removing it altogether. This seems simple, but there
> > ! * are several underlying assumptions:
> > ! *
> > ! * 1. A tuple marked by an multixact containing a very old
> > ! * committed update Xid would have been pruned away by vacuum; we
> > ! * wouldn't be freezing this tuple at all.
> > ! *
> > ! * 2. There cannot possibly be any live locking members remaining
> > ! * in the multixact. This is because if they were alive, the
> > ! * update's Xid would had been considered, via the lockers'
> > ! * snapshot's Xmin, as part the cutoff_xid.
>
> READ COMMITTED transactions can reset MyPgXact->xmin between commands,
> defeating that assumption; see SnapshotResetXmin(). I have attached an
> isolationtester spec demonstrating the problem.

Any idea how to cheat our way out of that one, given the current way
heap_freeze_tuple() works (running on both primary and standby)? My only
idea was to MultiXactIdWait() if !InRecovery, but that's extremely grotty.
We can't even realistically create a new multixact with fewer members
with the current format of xl_heap_freeze.
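
To make the failure mode concrete, here is a toy model, not PostgreSQL
code. All names (ToyBackend, toy_oldest_xmin, toy_update_below_cutoff) are
made up for illustration, XIDs are plain integers without wraparound, and
the cutoff is assumed to equal the oldest advertised xid/xmin (as if
freeze_min_age were 0). It only shows that once a locker stops advertising
its old xmin, the cutoff can pass the update's Xid while the locker itself
is still running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: plain integers, no XID wraparound, not PostgreSQL code. */
typedef uint32_t ToyXid;

#define TOY_INVALID_XID 0

/* Hypothetical stand-in for what a backend advertises to others. */
typedef struct ToyBackend
{
    ToyXid xid;   /* the backend's own in-progress transaction id */
    ToyXid xmin;  /* advertised snapshot xmin, TOY_INVALID_XID if none */
} ToyBackend;

/*
 * Oldest xid still advertised by any backend, either as its own xid or as
 * its snapshot xmin. Returns next_xid if nothing older is advertised.
 */
ToyXid
toy_oldest_xmin(const ToyBackend *backends, int n, ToyXid next_xid)
{
    ToyXid result = next_xid;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
    {
        if (backends[i].xid != TOY_INVALID_XID && backends[i].xid < result)
            result = backends[i].xid;
        if (backends[i].xmin != TOY_INVALID_XID && backends[i].xmin < result)
            result = backends[i].xmin;
    }
    return result;
}

/* The test assumption 2 leans on: is the update's xid below the cutoff? */
bool
toy_update_below_cutoff(ToyXid update_xid, ToyXid cutoff_xid)
{
    return update_xid < cutoff_xid;
}
```

With one backend at xid 105 advertising xmin 100 and next_xid 110, an
update Xid of 100 is not below the cutoff. Clear that backend's xmin, as
happens between READ COMMITTED commands, and the cutoff becomes 105, so the
check passes although xid 105 (a possible locker) is still in progress.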

> The test spec additionally
> covers a (probably-related) assertion failure, new in 9.3.2.

Too bad it's too late to do anything about it for 9.3.2. :( At least the
latter (the assertion failure) actually seems unrelated; I am not sure
why it's 9.3.2 only. Alvaro, are you looking?

> That was the only concrete runtime problem I found during a study of the
> newest heap_freeze_tuple() and heap_tuple_needs_freeze() code.

I'd even be interested in fuzzy problems ;). If 9.3 hadn't already been
released, the interactions between cutoff_xid/multi would have caused me
to say "back to the drawing board"... I wouldn't be surprised if further
things are lurking there.

> One thing that
> leaves me unsure is the fact that vacuum_set_xid_limits() does no locking to
> ensure a consistent result between GetOldestXmin() and GetOldestMultiXactId().
> Transactions may start or end between those calls, making the
> GetOldestMultiXactId() result represent a later set of transactions than the
> GetOldestXmin() result. I suspect that's fine. New transactions have no
> immediate effect on either cutoff, and transaction end can only increase a
> cutoff. Using a slightly-lower cutoff than the maximum safe cutoff is always
> okay; consider vacuum_defer_cleanup_age.

Yes, that seems fine to me, with the same reasoning.
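
The monotonicity argument can be sketched with a toy model. This is not
PostgreSQL code: sketch_oldest and sketch_cutoff_is_safe are hypothetical
names, XIDs are plain integers without wraparound, and the "cutoff" is
simply the minimum of the running set. It illustrates why a value read a
moment too early can only be lower, and hence still safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: plain integers, no XID wraparound, not PostgreSQL code. */
typedef uint32_t SketchXid;

/*
 * Oldest running xid, or next_xid if nothing is running. New transactions
 * get xids >= every existing one, so they never lower this value; a
 * transaction ending removes an element, which can only raise it.
 */
SketchXid
sketch_oldest(const SketchXid *running, size_t n, SketchXid next_xid)
{
    SketchXid oldest = next_xid;

    for (size_t i = 0; i < n; i++)
    {
        assert(running[i] < next_xid);  /* running xids were already assigned */
        if (running[i] < oldest)
            oldest = running[i];
    }
    return oldest;
}

/* A cutoff at or below the maximum safe cutoff is always acceptable. */
int
sketch_cutoff_is_safe(SketchXid used, SketchXid max_safe)
{
    return used <= max_safe;
}
```

For a running set {100, 103, 107} with next_xid 110 the result is 100.
If 100 ends before the second lookup, the result becomes 103; if a new
transaction 110 starts instead, it stays at 100. Either way the earlier
value never exceeds the later one, so using it is merely conservative.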

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
