Re: [BUG] orphaned function

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gilles Darold <gilles(at)darold(dot)net>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] orphaned function
Date: 2020-12-17 23:26:48
Message-ID: 119576.1608247608@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Gilles Darold <gilles(at)darold(dot)net> writes:
> The same problem applies if the returned type or the procedural language
> is dropped. I have tried to fix that in ProcedureCreate() by using an
> AccessSharedLock for the data types and languages involved in the
> function declaration. With this patch the race condition with parameters
> types, return types and PL languages are fixed. Only data types and
> procedural languages with Oid greater than FirstBootstrapObjectId will
> be locked locked. But this is probably more complex than this fix so it
> is proposed as a separate patch
> (v1-003-orphaned_function_types_language.patch) to not interfere with
> the applying of Bertran's patch.

Indeed. This points up one of the things that is standing in the way
of any serious consideration of applying this patch. To my mind there
are two fundamental, somewhat interrelated considerations that haven't
been meaningfully addressed:

1. What set of objects is it worth trying to remove this race condition
for, and with respect to what dependencies? Bertrand gave no
justification for worrying about function-to-namespace dependencies
specifically, and you've likewise given none for expanding the patch
to consider function-to-datatype dependencies. There are dozens more
cases that could be considered; but I sure don't want to be processing
another ad-hoc fix every time someone thinks they're worried about
another specific case.

Taking a practical viewpoint, I'm far more worried about dependencies
of tables than those of any other kind of object. A messed-up function
definition doesn't cost you much: you can just drop and recreate the
function. A table full of data is way more trouble to recreate, and
indeed the data might be irreplaceable. So it seems pretty silly to
propose that we try to remove race conditions for functions' dependencies
on datatypes without removing the same race condition for tables'
dependencies on their column datatypes.

But any of these options lead to the same question: why stop there?
An approach that would actually be defensible, perhaps, is to incorporate
this functionality into the dependency mechanism: any time we're about to
create a pg_depend or pg_shdepend entry, take out an AccessShareLock on
the referenced object, and then check to make sure it still exists.
This would improve the dependency mechanism so it prevents creation-time
race conditions, not just attempts to drop dependencies of
already-committed objects. It would mean that the one patch would be
the end of the problem, rather than looking forward to a steady drip of
new fixes.

2. How much are we willing to pay for this added protection? The fact
that we've gotten along fine without it for years suggests that the
answer needs to be "not much". But I'm not sure that that actually
is the answer, especially if we don't have a policy that says "we'll
protect against these race conditions but no other ones". I think
there are possibly-serious costs in three different areas:

* Speed. How much do all these additional lock acquisitions slow
down a typical DDL command?

* Number of locks taken per transaction. This'd be particularly an
issue for pg_restore runs using single-transaction mode: they'd take
not only locks on the objects they create, but also a bunch of
reference-protection locks. It's not very hard to believe that this'd
make a difference in whether restoring a large database is possible
without increasing max_locks_per_transaction.

* Risk of deadlock. The reference locks themselves should be sharable,
so maybe there isn't much of a problem, but I want to see this question
seriously analyzed not just ignored.

Obviously, the magnitude of these costs depends a lot on how many
dependencies we want to remove the race condition for. But that's
exactly the reason why I don't want a piecemeal approach of fixing
some problems now and some other problems later. That's way too
much like the old recipe for boiling a frog: we could gradually get
into serious performance problems without anyone ever having stopped
to consider the issue.

In short, I think we should either go for a 100% solution if careful
analysis shows we can afford it, or else have a reasoned policy
why we are going to close these specific race conditions and no others
(implying that we'll reject future patches in the same area). We
haven't got either thing in this thread as it stands, so I do not
think it's anywhere near being ready to commit.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-12-17 23:41:01 Re: Refactoring HMAC in the core code
Previous Message Jeff Davis 2020-12-17 23:13:25 Re: Minor documentation error regarding streaming replication protocol