Re: pg15b2: large objects lost on upgrade

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shruthi Gowda <gowdashru(at)gmail(dot)com>
Subject: Re: pg15b2: large objects lost on upgrade
Date: 2022-07-11 13:16:30
Message-ID: CA+Tgmobpm++X2g+SK6wqfHyTmyNcapoQtZbHJFMs-aq3ije=Aw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jul 10, 2022 at 9:31 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> Hmm. That would mean that the more LOs a cluster has, the more bloat
> there will be in the new cluster once the upgrade is done. That could
> be quite a few gigs worth of data laying around depending on the data
> inserted in the source cluster, and we don't have a way to know which
> files to remove post-upgrade, do we?

The files that are being leaked here are the files backing the
pg_largeobject table and the corresponding index as they existed in
the new cluster just prior to the upgrade. Hopefully, the table is a
zero-length file and the index is just one block, because you're
supposed to use a newly-initdb'd cluster as the target for a
pg_upgrade operation. Now, you don't actually have to do that: as
we've been discussing, there aren't as many sanity checks in this code
as there probably should be. But it would still be surprising to
initdb a new cluster, load gigabytes of large objects into it, and
then use it as the target cluster for a pg_upgrade.

As for whether it's possible to know which files to remove
post-upgrade, that's the same problem as trying to figure out whether
their are orphaned files in any other PostgreSQL cluster. We don't
have a tool for it, but if you're sure that the system is more or less
quiescent - no uncommitted DDL, in particular - you can compare
pg_class.relfilenode to the actual filesystem contents and figure out
what extra stuff is present on the filesystem level.

I am not saying we shouldn't try to fix this up more thoroughly, just
that I think you are overestimating the consequences.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-07-11 13:17:13 Re: CREATE TABLE ( .. STORAGE ..)
Previous Message Peter Eisentraut 2022-07-11 13:09:19 Re: [PATCH] Log details for client certificate failures