Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Date: 2014-07-20 19:55:25
Message-ID: 13210.1405886125@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I wrote:
> I started transcribing Bruce's proposed fix procedure at
> https://wiki.postgresql.org/wiki/20140702pg_upgrade_fix
> into the release notes, but I'm afraid it's all wet.

> He's suggesting copying the last checkpoint's NextMultiXactId into
> datminmxid/relminmxid, which is surely the wrong thing: that's likely to
> be newer than all mxids in the tables, not older than them. I thought at
> first that this was a simple thinko and he meant to write oldestMultiXid,
> but here's the thing: if we're in the situation where we've got
> wraparound, isn't oldestMultiXid going to be 1? The value recorded in the
> checkpoint isn't magic, it's just going to be extracted from whatever's in
> pg_database; and the whole problem here is that we can't trust that data.
> Where can we get a useful lower bound from?

Ugh: it's worse than that. pg_upgrade itself is using this utterly
nonsensical logic to set datminmxid/relminmxid. This is a stop-ship
issue for 9.3.5.

After some reflection it seems to me that we could estimate oldestmxid for
a pre-9.3 source cluster as the NextMultiXactId from its pg_control less
2000000000 or so. This will nearly always be much older than the actual
oldest mxid, but that's okay --- the next vacuuming cycle will advance the
datminmxid/relminmxid values to match reality, so long as they aren't
wrapped around already.

Note that there's already an assumption baked into pg_upgrade that 2E9
xids or mxids back is safely past the oldest actual data; see where it
sets autovacuum_freeze_max_age and autovacuum_multixact_freeze_max_age
while starting the new cluster.

(Hm ... I guess "2000000000 or so" actually needs to be a bit less than
that, otherwise autovacuum might kick off while we're munging the new
cluster.)

We could recommend the same estimate in the instructions about cleaning
up a previous pg_upgrade by hand.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2014-07-20 20:02:25 Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Previous Message Tom Lane 2014-07-20 17:37:01 Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts