Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nitin Motiani <nitinmotiani(at)google(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible
Date: 2026-02-04 16:06:29
Message-ID: aYNuhbXkM5h8Zcsu@nathan
Lists: pgsql-hackers

On Tue, Feb 03, 2026 at 06:46:25PM -0500, Andres Freund wrote:
>> The reason, I think, is that the COPY is happening into a relfilenode that
>> will be overwritten later, it doesn't yet contain the contents of the old
>> cluster.
>>
>> Presumably we do this because we need the temporary pg_largeobject_metadata to
>> make COMMENT ON and security label commands not fail.
>>
>> If this is the reasoning / how it works, shouldn't there be a comment in the
>> code or the commit message explaining that? Because it sure seems non-obvious
>> to me.

Right, the COPY for LOs with comments and security labels is solely meant
to avoid failure when restoring the comments and security labels, since we
won't have transferred the relation files yet. This was the case before
commit 12a53c732c, where we had this comment in getBlobs():

* We *do* dump out the definition of the blob because we need that to
* make the restoration of the comments, and anything else, work since
* pg_upgrade copies the files behind pg_largeobject and
* pg_largeobject_metadata after the dump is restored.

Commit 3bcfcd815e (mine) added this one to pg_dump.c:

* If upgrading from v16 or newer, only dump large objects with
* comments/seclabels. For these upgrades, pg_upgrade can copy/link
* pg_largeobject_metadata's files (which is usually faster) but we
* still need to dump LOs with comments/seclabels here so that the
* subsequent COMMENT and SECURITY LABEL commands work. pg_upgrade
* can't copy/link the files from older versions because aclitem
* (needed by pg_largeobject_metadata.lomacl) changed its storage
* format in v16.

IIUC your critique is that this doesn't explain the overwriting behavior
like the older comment does. I'll work on adding that.
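
For readers following along, the rule stated in the pg_dump.c comment quoted above can be written as a small predicate. This is only an illustrative sketch, not the actual pg_dump code: the function name and parameters are hypothetical, and it assumes the server version is given in PostgreSQL's numeric form (160000 for v16).

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a real pg_dump function): decide whether a large
 * object's definition must be dumped in binary-upgrade mode, following the
 * rule in the quoted comment.
 */
static bool
lo_needs_dump_in_binary_upgrade(int old_version_num,
                                bool has_comment,
                                bool has_seclabel)
{
    /*
     * Before v16, aclitem had a different storage format, so
     * pg_largeobject_metadata's files cannot be copied/linked and every
     * large object has to be dumped.
     */
    if (old_version_num < 160000)
        return true;

    /*
     * From v16 on, the files are transferred by pg_upgrade. Only LOs with
     * comments or security labels are dumped, so that the later COMMENT and
     * SECURITY LABEL commands find the object.
     */
    return has_comment || has_seclabel;
}
```

Under these assumptions, a v17 large object with no comment and no security label is skipped, while the same object on a v15 cluster is dumped.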

>> It's also not entirely obvious to me that this is safe - after all
>> (bbe08b8869bd, revised in 0e758ae89) appeared to have taken some pains to
>> ensure that the file gets unlinked immediately during the "binary upgrade
>> mode" TRUNCATE. But now we are actually filling that file again, after the
>> relation had been truncated?
>
> An example of what could go wrong:
>
> [... examples of what could go wrong ...]

I'm considering a couple of options here, but it seems like the easiest
thing to do is to move the TRUNCATE commands to the end of the dump file.
At least, that seems to be sufficient for our existing tests. If that
seems okay to you, I can work on putting together a patch.
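
As a toy illustration of the reordering idea only (this is not pg_dump code, and no patch exists yet), the sketch below stably moves commands that start with TRUNCATE to the end of a list while preserving the relative order of everything else. The function names and the plain string array are assumptions made for the example; pg_dump's real dump-object ordering works differently.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: treat any command beginning with "TRUNCATE" as one. */
static int
is_truncate(const char *sql)
{
    return strncmp(sql, "TRUNCATE", 8) == 0;
}

/*
 * Stable reorder: all non-TRUNCATE commands first, then all TRUNCATE
 * commands, each group keeping its original relative order.
 * Returns 0 on success, -1 if memory allocation fails.
 */
static int
move_truncates_to_end(const char **cmds, size_t n)
{
    const char **tmp;
    size_t      i;
    size_t      k = 0;

    if (n == 0)
        return 0;

    tmp = malloc(n * sizeof(*tmp));
    if (tmp == NULL)
        return -1;

    /* First pass: everything that is not a TRUNCATE. */
    for (i = 0; i < n; i++)
        if (!is_truncate(cmds[i]))
            tmp[k++] = cmds[i];

    /* Second pass: the TRUNCATE commands, now at the end. */
    for (i = 0; i < n; i++)
        if (is_truncate(cmds[i]))
            tmp[k++] = cmds[i];

    memcpy(cmds, tmp, n * sizeof(*tmp));
    free(tmp);
    return 0;
}
```

With an input of a TRUNCATE, a COPY, and a COMMENT command in that order, the result is COPY, COMMENT, TRUNCATE.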

--
nathan
