Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible

From: Andres Freund <andres(at)anarazel(dot)de>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nitin Motiani <nitinmotiani(at)google(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible
Date: 2026-02-03 23:16:17
Message-ID: 3yd2ss6n7xywo6pmhd7jjh3bqwgvx35bflzgv3ag4cnzfkik7m@hiyadppqxx6w
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2025-09-08 14:20:33 -0500, Nathan Bossart wrote:
> On Sat, Sep 06, 2025 at 10:12:11AM +0900, Michael Paquier wrote:
> > Yep. This plan is safe to rely on.
>
> Committed, thanks for reviewing!

I just had occasion to look at this code and I found the way this works quite
confusing:

When upgrading from a new enough server, we migrate
pg_largeobject_metadata. So far so good. But we *also* do a COPY FROM
COPY "pg_catalog"."pg_largeobject_metadata" ("oid", "lomowner", "lomacl") FROM stdin;
...

for all the large objects that have a description or a security label.

For a while I was somewhat baffled, because that sure looks like it ought to
lead to uniqueness violations. But it doesn't.

The reason, I think, is that the COPY is happening into a relfilenode that
will be overwritten later, it doesn't yet contain the contents of the old
cluster.

Presumably we do this because we need the temporary pg_largeobject_metadata to
make COMMENT ON and security label commands not fail.

If this is the reasoning / how it works, shouldn't there be a comment in the
code or the commit message explaining that? Because it sure seems non-obvious
to me.

It's also not entirely obvious to me that this is safe - after all
(bbe08b8869bd, revised in 0e758ae89) appeared to have taken some pains to
ensure that the file gets unlinked immediately during the "binary upgrade
mode" TRUNCATE. But now we are actually filling that file again, after the
relation had been truncated?

Separately, if we don't handle large objects that don't have comments or
labels via pg_dump, why do we do dependency tracking for all LOs in getLOs(),
rather than just the ones that have a comment / label? Given how much memory
both the query results and the dependency tracking take...

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikolay Samokhvalov 2026-02-03 23:42:16 Re: IO wait events for COPY FROM/TO PROGRAM or file
Previous Message Tom Lane 2026-02-03 22:52:15 Re: IPC::Run::time[r|out] vs our TAP tests