Re: pg_upgrade allows itself to be run twice

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_upgrade allows itself to be run twice
Date: 2022-06-29 04:17:33
Message-ID: YrvSXbpOppG7jfum@paquier.xyz
Lists: pgsql-hackers

On Sat, Jun 25, 2022 at 11:04:37AM -0500, Justin Pryzby wrote:
> I expect pg_upgrade to fail if I run it twice in a row.

Yep.

> It would be reasonable if it were to fail during the "--check" phase,
> maybe by failing like this:
> | New cluster database "..." is not empty: found relation "..."

So, running pg_upgrade a second time after a completed run, with or
without --check, complains that the new cluster is not empty. This
happens in check_new_cluster(), which fails with a fatal error if it
finds a relation in a namespace other than pg_catalog.

> But that fails to happen if the cluster has neither tables nor matviews, in
> which case, it passes the check phase and then fails like this:

Indeed, because of get_rel_infos(), which only collects tables and
matviews (plus their indexes and TOAST relations).
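
To make the gap concrete, here is a rough, simplified model of how the
emptiness check interacts with the relation filter. It is not the
pg_upgrade source: SketchRel, sketch_is_collected() and
sketch_find_non_empty() are hypothetical names, and the filter assumes
that only tables ('r') and matviews ('m') are collected, as described
in the quoted report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for pg_upgrade's relation info. */
typedef struct SketchRel
{
    const char *nspname;    /* schema name */
    const char *relname;    /* relation name */
    char        relkind;    /* 'r' table, 'm' matview, 'f' foreign table */
} SketchRel;

/* Assumption: only tables and matviews are collected from the cluster. */
static bool
sketch_is_collected(const SketchRel *rel)
{
    return rel->relkind == 'r' || rel->relkind == 'm';
}

/*
 * Return the first collected relation living outside pg_catalog, or NULL
 * if the cluster looks "empty" to the check.  Relations that were never
 * collected cannot trip the check.
 */
static const SketchRel *
sketch_find_non_empty(const SketchRel *rels, size_t nrels)
{
    for (size_t i = 0; i < nrels; i++)
    {
        if (!sketch_is_collected(&rels[i]))
            continue;           /* invisible to the emptiness check */
        if (strcmp(rels[i].nspname, "pg_catalog") != 0)
            return &rels[i];    /* would be reported as "not empty" */
    }
    return NULL;
}
```

With this model, a cluster holding only a foreign table in "public"
yields NULL, so the check passes, while a single ordinary table in
"public" is reported.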

> I'll concede that a cluster which has no tables sounds a lot like a toy, but I
> find it disturbing that nothing prevents running the process twice, up to the
> point that it's evidently corrupted the catalog.

I have heard of instances that are used only with a set of foreign
tables, for example. Not sure that this is widespread enough to
worry about, but it would fail in the same way as your case.

> While looking at this, I noticed that starting postgres --single immediately
> after initdb allows creating objects with OIDs below FirstNormalObjectId
> (thereby defeating pg_upgrade's check). I'm not familiar with the behavioral
> differences of single user mode, and couldn't find anything in the
> documentation.

This one comes from NextOID in the control data file after a fresh
initdb. In a postmaster environment, GetNewObjectId() forces the
counter up to FirstNormalObjectId when assigning the first user OID,
which is not enforced in single-user mode. Are you suggesting an extra
step at the end of initdb to update the control data file of the new
cluster to reflect FirstNormalObjectId?
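
For reference, a simplified model of that range check, written from
memory rather than copied from the tree. OidCounter and
sketch_get_new_object_id() are hypothetical names, locking and WAL
logging are left out, and the two constants are what I recall from
access/transam.h, so treat them as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Assumed values, recalled from src/include/access/transam.h. */
#define FIRST_GENBKI_OBJECT_ID  10000u
#define FIRST_NORMAL_OBJECT_ID  16384u

/* Hypothetical stand-in for the shared OID counter. */
typedef struct OidCounter
{
    Oid next_oid;
} OidCounter;

/*
 * Hand out the next OID.  Under a postmaster, anything below
 * FIRST_NORMAL_OBJECT_ID is bumped up to it.  In bootstrap or
 * single-user mode, only values below FIRST_GENBKI_OBJECT_ID are
 * bumped, so OIDs in between can still be assigned.
 */
static Oid
sketch_get_new_object_id(OidCounter *counter, bool is_postmaster_environment)
{
    if (counter->next_oid < FIRST_NORMAL_OBJECT_ID)
    {
        if (is_postmaster_environment)
            counter->next_oid = FIRST_NORMAL_OBJECT_ID;
        else if (counter->next_oid < FIRST_GENBKI_OBJECT_ID)
            counter->next_oid = FIRST_NORMAL_OBJECT_ID;
    }
    return counter->next_oid++;
}
```

Starting from a counter in the reserved range, the postmaster path
returns FirstNormalObjectId, while the single-user path returns the
stored value as-is, which matches what you observed.
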
--
Michael
