Re: pg15b2: large objects lost on upgrade

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shruthi Gowda <gowdashru(at)gmail(dot)com>
Subject: Re: pg15b2: large objects lost on upgrade
Date: 2022-07-29 19:10:09
Message-ID: CA+TgmoYKVbB9rt2UAvaFLG-qYCgRuhAr7wPLDwTT0VpbPKvj_w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 29, 2022 at 2:35 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> But what exactly is this test case testing? I've previously complained
> about buildfarm outputs not being labelled as well as they need to
> be in order to be easily understood by, well, me anyway. It'd sure
> help if the commands that led up to this problem were included in the
> output. I downloaded latest-client.tgz from the build farm server and
> am looking at TestUpgradeXversion.pm, but there's no mention of
> -amcheck-1.log in there, just -analyse.log, -copy.log, and following.
> So I suppose this is running some different code or special
> configuration...

I was able to reproduce the problem by running 'make installcheck'
against a 9.4 instance and then doing a pg_upgrade to 16devel (which
took many tries because pg_upgrade reported the different kinds of
things it didn't like one at a time; I just dropped objects from
the regression DB until it worked). The dump output looks like this:

-- For binary upgrade, set pg_largeobject relfrozenxid and relminmxid
UPDATE pg_catalog.pg_class
SET relfrozenxid = '0', relminmxid = '0'
WHERE oid = 2683;
UPDATE pg_catalog.pg_class
SET relfrozenxid = '990', relminmxid = '1'
WHERE oid = 2613;

-- For binary upgrade, preserve pg_largeobject and index relfilenodes
SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('12364'::pg_catalog.oid);
SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('12362'::pg_catalog.oid);
TRUNCATE pg_catalog.pg_largeobject;

However, the catalogs show the relfilenode being correct, and the
relfrozenxid set to a larger value. I suspect the problem here is that
this needs to be done in the other order, with the TRUNCATE first and
the update to the pg_class columns afterward.
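
To make the suggested reordering concrete, here is a minimal sketch of
what the dump output might look like with the TRUNCATE first. It only
rearranges the statements shown above, using the same OIDs and values
from this particular reproduction; it is an illustration of the idea,
not a tested or committed fix, and the final SELECT is an added,
hypothetical sanity check that is not part of the dump.

```sql
-- Sketch only: same statements as the dump above, in the other order.
-- The binary_upgrade_* functions are only usable in binary-upgrade mode.

-- 1. Preserve the relfilenodes and truncate first.
SELECT pg_catalog.binary_upgrade_set_next_index_relfilenode('12364'::pg_catalog.oid);
SELECT pg_catalog.binary_upgrade_set_next_heap_relfilenode('12362'::pg_catalog.oid);
TRUNCATE pg_catalog.pg_largeobject;

-- 2. Only afterward set relfrozenxid and relminmxid, so that (assuming
--    the suspicion above is right) the TRUNCATE cannot replace them.
UPDATE pg_catalog.pg_class
SET relfrozenxid = '0', relminmxid = '0'
WHERE oid = 2683;
UPDATE pg_catalog.pg_class
SET relfrozenxid = '990', relminmxid = '1'
WHERE oid = 2613;

-- 3. Hypothetical check (not in the dump): inspect what the catalogs hold.
SELECT oid, relname, relfilenode, relfrozenxid, relminmxid
FROM pg_catalog.pg_class
WHERE oid IN (2613, 2683);
```

If the ordering is indeed the cause, the check in step 3 should show the
relfrozenxid and relminmxid values from the UPDATE statements rather
than larger ones.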

I think I'd better look into improving the TAP tests for this, too.

--
Robert Haas
EDB: http://www.enterprisedb.com
