Re: pg_upgrade bug found!

From: Noah Misch <noah(at)leadboat(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: RhodiumToad on IRC <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade bug found!
Date: 2011-04-08 11:08:13
Message-ID: 20110408110813.GA27915@tornado.gateway.2wire.net
Lists: pgsql-hackers

On Thu, Apr 07, 2011 at 10:21:06PM -0400, Bruce Momjian wrote:
> Noah Misch wrote:
> > 1) The pg_class.relfrozenxid that the TOAST table should have received
> > ("true relfrozenxid") is still covered by available clog files. Fixable
> > with some combination of pg_class.relfrozenxid twiddling and "SET
> > vacuum_freeze_table_age = 0; VACUUM toasttbl".
>
> Right, VACUUM FREEZE. I now see I don't need to set
> vacuum_freeze_table_age if I use the FREEZE keyword, e.g. gram.y has:
>
>     if (n->options & VACOPT_FREEZE)
>         n->freeze_min_age = n->freeze_table_age = 0;

True; it just performs more work than strictly necessary. We don't actually
need earlier-than-usual freezing. We need only ensure that the relfrozenxid
will guide future VACUUMs to do that freezing early enough. However, I'm not
sure how to do that without directly updating relfrozenxid, so it's probably
just as well to cause some extra work and stick to the standard interface.

> > 2) The true relfrozenxid is no longer covered by available clog files.
> > The fix for case 1 will get "file "foo" doesn't exist, reading as
> > zeroes" log messages, and we will treat all transactions as uncommitted.
>
> Uh, are you sure? I think it would return an error message about a
> missing clog file for the query; here is a report of a case not related
> to pg_upgrade:
>
> http://archives.postgresql.org/pgsql-admin/2010-09/msg00109.php

My statement was indeed incorrect. (I was looking at the "reading as zeroes"
message in slru.c, but that path applies only during recovery.)

> > Not generally fixable after that has happened. We could probably
> > provide a recipe for checking whether it could have happened given
> > access to a backup from just before the upgrade.
>
> The IRC folks pulled the clog files off of backups.

Since we do get the error after all, that should always be enough.

> One concern I have is that existing heap tables are protecting clog
> files, but once those are frozen, the system might remove clog files not
> realizing it has to freeze the heap tables too.

Yes. On a similar note, would it be worth having your prototype fixup script
sort the VACUUM FREEZE calls in descending relfrozenxid order?
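
A rough sketch of that ordering, in case it helps to see it concretely. It
assumes the fixup script has already fetched (schema, relation,
age(relfrozenxid)) rows from pg_class; the function names and the row shape
are hypothetical, not part of pg_upgrade or the prototype script. Sorting by
ascending age is used as a stand-in for descending relfrozenxid, since raw
xid values wrap around. The intent, as I read it, is that the relations with
the oldest relfrozenxid are frozen last.

```python
from typing import Iterable, List, Tuple

# Hypothetical row shape: (schema name, relation name, age(relfrozenxid)).
Row = Tuple[str, str, int]


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def freeze_statements(rows: Iterable[Row]) -> List[str]:
    """Return VACUUM FREEZE statements, youngest relfrozenxid first.

    Ascending age(relfrozenxid) corresponds to descending relfrozenxid,
    so the relations with the oldest relfrozenxid come last.
    """
    ordered = sorted(rows, key=lambda row: row[2])
    return [
        f"VACUUM FREEZE {quote_ident(schema)}.{quote_ident(rel)};"
        for schema, rel, _age in ordered
    ]
```

The resulting list could be written to a file and fed to psql; the sketch
does not connect to a database itself.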

Thanks,
nm
