VACUUM FULL FREEZE is unsafe

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: VACUUM FULL FREEZE is unsafe
Date: 2004-11-27 21:41:51
Message-ID: 12179.1101591711@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The point of VACUUM FREEZE is to ensure that there are no tuples
present in the database whose commit status depends on "normal" XIDs.
Without this guarantee, cloning template0 might stop working once
the relevant part of pg_clog has been pruned.

If one combines freezing with moving tuples across pages (ie,
VACUUM FULL FREEZE), then the commit status of moved tuples may
depend on the vacuum's own XID (stored in XVAC). To maintain the
freeze safety guarantee, we'd want to be sure that upon successful
completion of the VACUUM, there are no moved tuples that haven't had
their status hint bits updated to XMIN_COMMITTED or XMIN_INVALID.

After some digging through vacuum.c, I have convinced myself that
this does occur for all tuples moved down from the end of the table.
update_hint_bits() takes care of all MOVED_IN rows; MOVED_OFF rows
in the page that becomes the physically last page of the table are
fixed near the bottom of repair_frag(); and MOVED_OFF rows in
pages after that don't matter because we'll truncate those pages
away entirely.

Unfortunately this still leaves one case uncovered, which is a tuple
that is moved because it is part of an update chain. If an original
tuple in an update chain is in a page that is below the new end of
the table, and was not a move target page (eg because it had no free
space), then that tuple will never be visited to change its state from
MOVED_OFF to XMIN_INVALID.

This doesn't break initdb, because there will be no update-chain cases
since no other transactions can be running. But it poses a nasty hazard
for anyone who is updating and re-freezing a template database during
normal operations (as for example in following the manual bug fix
procedures we had to recommend for some of the 7.4 dot releases).

Also, even though I don't see any failure cases for initdb, it seems
awfully risky to assume that this is all going to work 100%; and if
initdb did leave any improperly frozen tuples behind, it's quite likely
we'd not notice the error until the code got into the field.

ISTM that the safer way to handle this is VACUUM FULL (to compact)
and then VACUUM FREEZE (to freeze). It's much clearer that lazy VACUUM
can handle freezing reliably, because it never tries to move tuples
around.

Just doing this in initdb is a one-liner change, but I'm wondering if we
ought to enforce that FULL and FREEZE not be specified at the same time,
so that people couldn't risk such a problem in manual freezing of
template databases.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2004-11-27 21:59:45 Re: [pgsql-hackers-win32] [BUGS] pg_autovacuum in 8beta-dev3 small bug
Previous Message Patrick B Kelly 2004-11-27 20:04:12 Re: Bitmap index