Re: Avoid orphaned objects dependencies, take 3

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Avoid orphaned objects dependencies, take 3
Date: 2025-05-22 12:15:11
Message-ID: aC8VT2OjOyPJuKv/@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, May 21, 2025 at 10:17:58AM -0700, Jeff Davis wrote:
> On Wed, 2025-05-21 at 12:55 -0400, Robert Haas wrote:
> > Yeah, that's not a terrible idea. I still like the idea I thought
> > Bertrand was pursuing, namely, to take no lock in
> > recordDependencyOn()
> > but assert that the caller has previously acquired one. However, we
> > could still do the Assert() check with this design when NoLock is
> > passed, so I think this is a reasonable alternative to that design.
>
> I'd have to see the patch to see whether I liked the end result. But
> I'm guessing that involves a lot of non-mechanical changes in the call
> sites, and also relies on test coverage for all of them.

Thinking more about it, I'm not sure the NoLock/AccessShareLock will be easy
to make it right and maintain.

The thing is that in recordMultipleDependencies() we may iterate over multiple
referenced objects and those are the ones we want a lock on.

So if we enter recordMultipleDependencies() with "NoLock" we would need to be
100% sure that ALL the referenced objects are already locked (or that we don't
want to take a lock on ALL the referenced objects).

But that's not so easy, for example with something like:

create schema my_schema;
CREATE TABLE my_schema.test_maint(i INT);
INSERT INTO my_schema.test_maint VALUES (1), (2);
CREATE FUNCTION my_schema.fn(INT) RETURNS INT IMMUTABLE LANGUAGE plpgsql AS $$
BEGIN
RAISE NOTICE 'BDT';
RETURN $1;
END;
$$;

Then during:

CREATE MATERIALIZED VIEW my_schema.test_maint_mv AS SELECT my_schema.fn(i) FROM my_schema.test_maint;

1. The schema is locked
2. The relation is locked
3. The function is not locked

So we should pass "AccessShareLock" to recordMultipleDependencies() but that could
be easy to miss, resulting in passing "NoLock".

Furthermore, it means that even if we pass "AccessShareLock" correctly here, we'd
need to check that there is no existing lock on each referenced object (with
LockHeldByMe()/CheckRelationOidLockedByMe()) with a level > AccessShareLock (if
not, we'd add an extra lock without any good reason to do so).

With Robert's idea we could avoid to call LockDatabaseObject()/LockRelationOid()
if we know that the object/relation is already locked (or that we don't want
a lock at this place). But it would mean being 100% sure that if there are
multiple code paths leading to the same referenced object insertion location
then each of them have the same locking behavior.

As that seems hard, a better approach would probably be to also always call
LockHeldByMe()/CheckRelationOidLockedByMe() before trying to acquire the lock.

So I think:

- Jeff's idea could be hard to reason about, as "NoLock" could mean: we are sure
that ALL the existing referenced objects are already locked and "AccessShareLock"
would mean: we know that at least one referenced object needs an AccessShareLock.
Then that would mean that "by design" we'd need to check if there is no existing
lock before trying to acquire the AccessShareLock on the referenced objects.

- Robert's idea would still require that we check whether there is any existing
lock before acquiring the AccessShareLock (to be on the safe side of things).

So I wonder if, after all, it makes sense to simply try to acquire the
AccessShareLock on a referenced object in recordMultipleDependencies() IIF
this referenced object is not already locked (basically what was initially
proposed, but with this extra check added and without the "NoLock"/"lock"
addition to recordMultipleDependencies())).

That would probably address Robert's concern [1] "acquiring two locks on the same
object with different lock modes where we should really only acquire one" but
probably not this one "I think it might result in acquiring the
locks on those other objects at a subtly wrong time (leading to race
conditions)".

For the latter I'm not sure how that could be a subtly wrong time or how could
we determine what a subtly wrong time is (to avoid taking the lock).

Thoughts?

[1]: https://www.postgresql.org/message-id/CA%2BTgmoaCJ5-Zx5R0uN%2Bah4EZU1SamY1PneaW5O617FsNKavmfw%40mail.gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-05-22 12:21:23 Re: Make wal_receiver_timeout configurable per subscription
Previous Message Erik Nordström 2025-05-22 12:13:39 Relstats after VACUUM FULL and CLUSTER