Re: pg_upgrade bug found!

From: Noah Misch <noah(at)leadboat(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: RhodiumToad on IRC <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade bug found!
Date: 2011-04-07 19:44:58
Message-ID: 20110407194458.GA29297@tornado.gateway.2wire.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 07, 2011 at 12:16:55PM -0400, Bruce Momjian wrote:
> Bruce Momjian wrote:
> > OK, thanks to RhodiumToad on IRC, I was able to determine the cause of
> > the two reported pg_upgrade problems he saw via IRC. It seems toast
> > tables have xids and pg_dump is not preserving the toast relfrozenxids
> > as it should. Heap tables have preserved relfrozenxids, but if you
> > update a heap row but don't change the toast value, and the old heap row
> > is later removed, the toast table can have an older relfrozenxids than
> > the heap table.
> >
> > The fix for this is to have pg_dump preserve toast relfrozenxids, which
> > can be easily added and backpatched. We might want to push a 9.0.4 for
> > this. Second, we need to find a way for people to detect and fix
> > existing systems that have this problem, perhaps looming when the
> > pg_class relfrozenxid passes the toast relfrozenxid, and thirdly, we
> > need to figure out how to get this information to users. Perhaps the
> > communication comes through the 9.0.4 release announcement.
>
> I am not sure how to interpret the lack of replies to this email.
> Either it is confidence, shock, or we told you so. ;-)

Your explanation and patch make sense. Seems all too clear in retrospect.

> Any idea how to correct existing systems? Would VACUUM FREEZE of just
> the toast tables work? I perhaps could create a short DO block that
> would vacuum freeze just toast tables; it would have to be run in every
> database.

I see three cases:

1) The pg_class.relfrozenxid that the TOAST table should have received ("true
relfrozenxid") is still covered by available clog files. Fixable with some
combination of pg_class.relfrozenxid twiddling and "SET vacuum_freeze_table_age
= 0; VACUUM toasttbl".

2) The true relfrozenxid is no longer covered by available clog files. The fix
for case 1 will get "file "foo" doesn't exist, reading as zeroes" log messages,
and we will treat all transactions as uncommitted. Not generally fixable after
that has happened. We could probably provide a recipe for checking whether it
could have happened given access to a backup from just before the upgrade.

3) Enough transaction xids have elapsed such that the true relfrozenxid is again
covered by clog files, but those records are unrelated to the original
transactions. Actually, I don't think this can happen, even with the maximum
autovacuum_freeze_max_age.

I haven't tested those, so I'm sure there's some error in that assessment.

nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-04-07 19:46:26 Re: pg_upgrade bug found!
Previous Message Jeff Davis 2011-04-07 19:38:08 Re: pg_upgrade bug found!