Re: Avoid orphaned objects dependencies, take 3

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Roman Eskin <r(dot)eskin(at)arenadata(dot)io>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Avoid orphaned objects dependencies, take 3
Date: 2026-04-28 11:17:17
Message-ID: afCXPVNuvpf5ECGZ@bdtpg
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, Apr 15, 2026 at 06:36:49PM +0000, Bertrand Drouvot wrote:
> Hi,
>
> On Sun, Nov 09, 2025 at 06:33:39PM +1000, Roman Eskin wrote:
> > Hi everybody,
> >
> > Apologies for jumping into this conversation.
>
> No problem at all, you're more than welcome to join it!
>
> > Our customers have also encountered a similar issue with the
> > concurrent drop of a dependent object.
>
> Yeah, it's not something so rare that one could thought (we also see a non
> negligible of them in our fleet).
>
> > In our code (in the Greengage
> > DB), we implemented a fix analogous to one of the first versions of
> > Bertrand's approach, using locking within the pg_depend code.
> >
> > In the long term, however, we are interested in aligning with the
> > upstream Postgres code as much as possible.
> >
> > Since there haven't been any recent updates in this thread, we were
> > wondering if there are any plans for the next steps regarding this
> > issue.
>
> Apologies for this late reply. I plan to work on this again in the coming weeks.
> The next step I've in mind is to build the list of the P1 cases described above.
>
> Once we've that list I think we could start discussing the next steps.

I've been able to identify the P1 cases (lock after acl check) thanks to 0003
attached (more on that later). With the list at hands, v20 attached is made of
3 sub patches.

0001: Avoid orphaned objects dependencies

Fix by acquiring AccessShareLock on referenced objects when recording
dependencies. This conflicts with AccessExclusiveLock taken by DROP,
preventing the race. After acquiring the lock, verify the object still
exists, if it was dropped concurrently, report an error.

The lock and check is done in both recordMultipleDependencies() and
changeDependencyFor().

Remarks:

1/ If the caller already holds a lock that conflicts with DROP (AccessShareLock
or stronger), skip the lock acquisition entirely.

2/ That does not address Robert's concern about lock after acl check cases and
that's what 0002 and 0003 are for.

0002: Lock referenced objects before permission checks in DDL commands

Add LockNotPinnedObjectById() calls before object_aclcheck() at all caller
sites where a permission check on a referenced object occurs before the
dependency on that object is recorded. This converts permission check before
lock to lock before permission check. It closes the TOCTOU window that could
allow a REVOKE to succeed between the permission check and the dependency
recording.

Note that the permission check before lock cases fixed in 0002 are already present.
0001 just makes the TOCTOU window wider in case there is a concurrent drop that
would block the dependency recording. So, I think that 0002 has merit on its own.

0003: Add Assert guard to detect permission check before lock regressions

Add instrumentation under USE_ASSERT_CHECKING to detect cases where object_aclcheck()
is called on a referenced object before a lock is held on it, which would widen
the TOCTOU window between the permission check and the dependency recording.

In recordMultipleDependencies() and changeDependencyFor(), for each referenced
object that had a permission check earlier in the same statement, assert that a
lock is already held. This catches any future code that adds a permission check
on a referenced object without first acquiring a lock.

This is enabled only for Assert enabled builds so that there is no overhead
for production builds.

I think that v20 is taking care of what has been discussed in this thread, mainly:

- It adds the new lock when the dependency is being recorded and thus avoids
having to modify about 250 call sites (as mentioned by Jeff in [1]).
- It addresses Robert's concern about the permission check before lock TOCTOU
window (thanks to 0002 that modifies about 70 call sites and 0003 that ensures
that we should trap new ones if any).

Thoughts?

[1]: https://postgr.es/m/d721011cd3ec3aedd57b193ef10cf541f50df325.camel%40j-davis.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v20-0001-Avoid-orphaned-objects-dependencies.patch text/x-diff 22.7 KB
v20-0002-Lock-referenced-objects-before-permission-checks.patch text/x-diff 33.2 KB
v20-0003-Add-Assert-guard-to-detect-permission-check-befo.patch text/x-diff 9.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2026-04-28 11:17:54 Re: oauth integer overflow
Previous Message Dilip Kumar 2026-04-28 11:03:34 Include schema-qualified names in publication error messages.