Re: Vacuumlo improvements

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Shawn McCoy <shawn(dot)the(dot)mccoy(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Vacuumlo improvements
Date: 2026-05-13 15:28:42
Message-ID: CAA5RZ0swSOmfVOfTP=4uu_Hv9iQN1Tij4SogzZxJPPM8KnR=_g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > Ideally, vacuumlo could be improved to:
> > - Resolve domain types back to their base types when scanning columns
> > (using pg_type.typbasetype), or
> > - At least emit a WARNING when it encounters columns with domains over
> > oid/lo that it is skipping, so the user is aware.
>
> Commit 64c604898e added the note about domains to the docs. Unfortunately,
> neither that nor the corresponding thread [0] offer any clues as to why
> vacuumlo doesn't resolve domains. The commit history for vacuumlo has been
> pretty quiet for a long time, so maybe it's just been overlooked.
>
> > At minimum, I can submit a documentation improvement to make the
> > data-loss risk more prominent. The current parenthetical note is easy
> > to miss.
>
> Improving the documentation seems reasonable, too.

+1 to documentation that calls out the risk of data-loss.

> Another thing we could explore is allowing users to specify which
> tables/columns refer to LOs,
> perhaps with a user-provided query. One wrinkle is that dblink allows
> specifying multiple databases, and presumably each database will be a
> little different.
> Separately, do you know whether users are using lo_manage() at all? And if
> not, why?

I think recommending the use of the LO extension [1] in the core large object
documentation is a good start. Ideally, a user should not have to run vacuumlo.
Using the LO extension, a user can use lo_manage for simple types ( or domain
over simple types ) or if they have a more complex situation, like a composite
type holding an LO, they can use a custom trigger.

In the case of TRUNCATE, since per-row triggers don't fire. But even
that can be handled with a statement level BEFORE TRUNCATE trigger that scans
and unlinks. vacuumlo then becomes a cleanup tool for legacy schemas, not a
routine requirement.

All to say, we should be steering the users towards this extension with more
recommendations, perhaps.

[1] https://www.postgresql.org/docs/current/lo.html

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2026-05-13 15:32:28 Re: egrep is obsolescent
Previous Message Tom Lane 2026-05-13 15:24:30 egrep is obsolescent