Re: visibility map corruption

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Floris Van Nee <florisvannee(at)optiver(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: visibility map corruption
Date: 2021-07-24 00:08:52
Message-ID: 20210724000852.GD8025@momjian.us
Lists: pgsql-hackers

On Thu, Jul 8, 2021 at 09:51:47AM -0400, Bruce Momjian wrote:
> On Thu, Jul 8, 2021 at 08:11:14AM -0500, Justin Pryzby wrote:
> > Also, the pg_upgrade status message still seems to be misplaced:
> >
> > In 20210706190612(dot)GM22043(at)telsasoft(dot)com, Justin Pryzby wrote:
> > > I re-arranged the pg_upgrade output of that patch: it was in the middle of the
> > > two halves: "Setting next transaction ID and epoch for new cluster"
> >
> > +++ b/src/bin/pg_upgrade/pg_upgrade.c
> > @@ -473,6 +473,12 @@ copy_xact_xlog_xid(void)
> > "\"%s/pg_resetwal\" -f -x %u \"%s\"",
> > new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
> > new_cluster.pgdata);
> > + check_ok();
> > + prep_status("Setting oldest XID for new cluster");
> > + exec_prog(UTILITY_LOG_FILE, NULL, true, true,
> > + "\"%s/pg_resetwal\" -f -u %u \"%s\"",
> > + new_cluster.bindir, old_cluster.controldata.chkpnt_oldstxid,
> > + new_cluster.pgdata);
> > exec_prog(UTILITY_LOG_FILE, NULL, true, true,
> > "\"%s/pg_resetwal\" -f -e %u \"%s\"",
> > new_cluster.bindir, old_cluster.controldata.chkpnt_nxtepoch,
>
> Wow, you are 100% correct. Updated patch attached.

OK, I have the patch ready to apply to all supported Postgres versions,
and it passes all my cross-version pg_upgrade tests.

However, I am now stuck on the commit message text, and I think this is
the point Peter Geoghegan was trying to make earlier. We know that
preserving the oldest xid in pg_control is the right thing to do, and
that setting it to the current xid - 2 billion (the old behavior) causes
vacuum freeze to run on all tables, but what else does this patch
affect?
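
For illustration, the "current xid - 2 billion" arithmetic can be
sketched as below. This is a simplified sketch, not the actual
pg_resetwal source: the helper name guess_oldest_xid is hypothetical, and
it assumes 32-bit xids where values 0 through 2 are reserved and 3 is the
first normal xid.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/* First xid usable by normal transactions; 0, 1 and 2 are reserved. */
#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Hypothetical helper (not a PostgreSQL function): derive an "oldest xid"
 * that lies two billion xids behind the given next xid.
 */
TransactionId
guess_oldest_xid(TransactionId next_xid)
{
    /* Unsigned subtraction wraps modulo 2^32, so small inputs wrap around. */
    TransactionId oldest = next_xid - 2000000000u;

    /* If the wrapped result lands on a reserved xid, step past them. */
    if (oldest < FirstNormalTransactionId)
        oldest += FirstNormalTransactionId;

    return oldest;
}
```

With a next xid of 3,000,000,000 this gives 1,000,000,000; with a next
xid of 1000 the subtraction wraps and gives 2,294,968,296.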

As far as I know, seeing a very low oldest xid causes autovacuum to
check all objects and make sure the age of their relfrozenxid is less
than autovacuum_freeze_max_age, but isn't that just a check? Would that
cause any table scans? I would think not. And would this cause
incorrect truncation of pg_xact or fsm or vm files? I would think not,
either.

Even if the old and new cluster had mismatched autovacuum_freeze_max_age
values, I don't see how that would cause any corruption either.

I could perhaps see corruption happening if pg_control's oldest xid
value was closer to the current xid value than it should be, but I can't
see how having it 2 billion away could cause harm, unless perhaps
pg_upgrade itself used enough xids to cause the counter to wrap more
than 2^31 away from the oldest xid recorded in pg_control.
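
The 2^31 limit can be illustrated with a circular comparison of 32-bit
xids. This is a minimal sketch under stated assumptions: xid_precedes is
a hypothetical name, it ignores the reserved xids, and it only shows the
modulo-2^32 arithmetic rather than any particular backend code path.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Hypothetical helper: circular "a is older than b" test for normal xids.
 * The unsigned difference is reinterpreted as a signed 32-bit value, so a
 * counts as older than b only while it is at most 2^31 xids behind b.
 * (The narrowing cast is implementation-defined in C; common compilers
 * use two's complement wrapping, which this sketch relies on.)
 */
bool
xid_precedes(TransactionId a, TransactionId b)
{
    int32_t diff = (int32_t) (a - b);

    return diff < 0;
}
```

Since 2^31 is 2,147,483,648, an oldest xid set 2 billion behind the
current xid leaves a margin of roughly 147 million xids before the
comparison flips and the oldest xid appears to be in the future.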

What I am basically asking is how to document this and what it fixes.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.
