Vacuumlo improvements

From: Shawn McCoy <shawn(dot)the(dot)mccoy(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Vacuumlo improvements
Date: 2026-05-12 17:34:10
Message-ID: CALsgZNAM=AYK-9ZLR7Z0YLx6Lyx5aSrjjse3T+FmsJ=jTvfhDQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'd like to discuss a behavior in the vacuumlo utility that can lead
to silent data loss when large object references are stored in columns
whose type is a domain over `oid` or `lo`. While fully stated in the
docs, we have observed users getting some surprises when they are
trying to do routine maintenance. I'll attach a very simple repro
that displays the behavior using
a fairly routine use of a domain. [1]

DESCRIPTION
---------------------
The vacuumlo documentation [2] states:

"Only types with these names are considered; in particular, domains
over them are not considered."

While the behavior is documented, in the field, the consequence is
severe: if a user creates a domain over `oid` (e.g., for semantic
clarity or to add constraints) and uses that domain-typed column to
store large object OIDs, vacuumlo will treat those LOs as orphaned and
delete them. The referenced data is silently destroyed.

This is particularly dangerous because:
- Domains over base types are a popular PostgreSQL practice
- There is no warning or diagnostic output from vacuumlo when it skips
domain-typed columns
- The --dry-run / -n flag will show these LOs as "would be removed"
but nothing indicates *why* they appear orphaned
- The data loss is irreversible

SUGGESTION
---------------------
Ideally, vacuumlo could be improved to:
- Resolve domain types back to their base types when scanning columns
(using pg_type.typbasetype), or
- At least emit a WARNING when it encounters columns with domains over
oid/lo that it is skipping, so the user is aware.

I don't currently have a patch attached, but wanted to shine light on
the issue given the silent data-loss risk. I know LO's are a sensitive
topic with discussions that wander towards deprecation, but they seem
to be here to stay and are very commonly used in the field.

At minimum, I can submit a documentation improvement to make the
data-loss risk more prominent. The current parenthetical note is easy
to miss.

[1] attached vacuumlo_repro.txt
[2] https://www.postgresql.org/docs/current/vacuumlo.html

Thanks,
Shawn

Attachment Content-Type Size
vacuumlo_repro.txt text/plain 1.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message SATYANARAYANA NARLAPURAM 2026-05-12 17:41:53 Re: [Bug]Assertion failure in LATERAL GRAPH_TABLE with multi-label pattern
Previous Message Alena Rybakina 2026-05-12 17:19:27 Re: pull-up subquery if JOIN-ON contains refs to upper-query