Avoid orphaned objects dependencies, take 3

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Avoid orphaned objects dependencies, take 3
Date: 2024-04-22 08:45:19
Message-ID: ZiYjn0eVc7pxVY45@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

This new thread is a follow-up of [1] and [2].

Problem description:

We have occasionally observed objects having an orphaned dependency, the
most common case we have seen is functions not linked to any namespaces.

Examples to produce such orphaned dependencies:

Scenario 1:

session 1: begin; drop schema schem;
session 2: create a function in the schema schem
session 1: commit;

With the above, the function created in session 2 would be linked to a non
existing schema.

Scenario 2:

session 1: begin; create a function in the schema schem
session 2: drop schema schem;
session 1: commit;

With the above, the function created in session 1 would be linked to a non
existing schema.

A patch has been initially proposed to fix this particular
(function-to-namespace) dependency (see [1]), but there could be much
more scenarios (like the function-to-datatype one highlighted by Gilles
in [1] that could lead to a function having an invalid parameter datatype).

As Tom said there are dozens more cases that would need to be
considered, and a global approach to avoid those race conditions should
be considered instead.

A first global approach attempt has been proposed in [2] making use of a dirty
snapshot when recording the dependency. But this approach appeared to be "scary"
and it was still failing to close some race conditions (see [2] for details).

Then, Tom proposed another approach in [2] which is that "creation DDL will have
to take a lock on each referenced object that'd conflict with a lock taken by
DROP".

This is what the attached patch is trying to achieve.

It does the following:

1) A new lock (that conflicts with a lock taken by DROP) has been put in place
when the dependencies are being recorded.

Thanks to it, the drop schema in scenario 2 would be locked (resulting in an
error should session 1 committs).

2) After locking the object while recording the dependency, the patch checks
that the object still exists.

Thanks to it, session 2 in scenario 1 would be locked and would report an error
once session 1 committs (that would not be the case should session 1 abort the
transaction).

The patch also adds a few tests for some dependency cases (that would currently
produce orphaned objects):

- schema and function (as the above scenarios)
- function and type
- table and type (which is I think problematic enough, as involving a table into
the game, to fix this stuff as a whole).

[1]: https://www.postgresql.org/message-id/flat/a4f55089-7cbd-fe5d-a9bb-19adc6418ae9(at)darold(dot)net#9af5cdaa9e80879beb1def3604c976e8
[2]: https://www.postgresql.org/message-id/flat/8369ff70-0e31-f194-2954-787f4d9e21dd%40amazon.com

Please note that I'm not used to with this area of the code so that the patch
might not take the option proposed by Tom the "right" way.

Adding the patch to the July CF.

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v1-0001-Avoid-orphaned-objects-dependencies.patch text/x-diff 13.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-04-22 08:49:16 Re: Patch to avoid orphaned dependencies
Previous Message jian he 2024-04-22 08:43:42 slightly misleading Error message in guc.c