Re: Event Triggers: adding information

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Event Triggers: adding information
Date: 2013-01-17 16:15:24
Message-ID: CA+TgmoYEW583z+oisToCYQMz2grzpavdXrCi3-W2D_+nucVhzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 17, 2013 at 5:18 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Now, if that's what it takes, I'll spend time on it. In which exact
> order do you want to be reviewing and applying that series of patches?

Let's agree on which things we even want to do first. Here's my take:

- adds ddl_command_trace and ddl_command_end events

I think ddl_command_end is OK and I'm willing to commit that if
extracted as its own patch. I think ddl_command_trace is unnecessary
syntactic sugar.

- causes those events to be called not only for actual SQL commands
but also for things that happen, at present, to go through the same
code path

I think this is a bad idea, not only because, as I said before, it
exposes internal implementation details, but also because it's
probably not safe. As Noah (I believe, or maybe Tom), observed in a
previous post, we've gone to a lot of work to make sure that nothing
you do inside a trigger can crash the server, corrupt the catalogs,
trigger elog() messages, etc. That applies in spades to DDL.

Despite Tom's skepticism, I'm pretty sure that it's safe for us to
execute trigger function calls either completely before or completely
after we execute the DDL command. It should be no different than
executing them as separate commands. But doing something in the
middle of a command is something else altogether. Consider, for
example, that ALTER TABLE does a bunch of crap internally to itself,
and then recurses into ProcessUtility to do some more stuff. This is
a terrible design, but it's what we've got. Are you absolutely sure
that the intermediate state that exists after that first bunch of
stuff is done and before those other things are done is a safe place
to execute arbitrary user-provided code? I'm not. And the more
places we add that recurse into ProcessUtility, or indeed, add event
triggers in general, the more places we're going to add new failure
modes.

Fixing that is part of the price of adding a new event trigger and I'm
OK with that. But I'm not OK with baking into the code the assumption
that any current or future callers of ProcessUtility() should be
prepared for arbitrary code execution. The code wasn't written with
that in mind, and it will not end well.

- adds additional magic variables to PL/pgsql to expose new
information not previously exposed

I'm not sure that I agree with the implementation of this, but I think
I'm OK with the concept. I will review it when it is submitted in a
form not intertwined with other things.

- adds CONTEXT as an additional event-trigger-filter variable, along with TAG

I'm OK with adding new event-trigger-filter variables, provided that
they are done without additional run-time overhead for people who are
not using event triggers; I think it's nice that we can support that.
But I think this particular proposal is DOA because nothing except
TOPLEVEL is actually safe.

>> As for the patch not being well-thought-out, I'm sticking by that,
>> too. Alvaro's comments about lock escalations should be enough to
>> tickle your alarm bells, but there are plenty of other problems here,
>> too.
>
> Here's the comment in the code where the lock problems are discussed:
>
> /*
> * Fill in the EventTriggerTargetOid for a DROP command and a
> * "ddl_command_start" Event Trigger: the lookup didn't happen yet and we
> * still want to provide the user with that information when possible.
> *
> * We only do the lookup if the command contains a single element, which is
> * often forced to be the case as per the grammar (see the drop_type:
> * production for a list of cases where more than one object per command is
> * allowed).
> *
> * We take no exclusive lock here, the main command will have to do the
> * name lookup all over again and take appropriate locks down after the
> * User Defined Code runs anyway, and the commands executed by the User
> * Defined Code will take their own locks. We only lock the objects here so
> * that they don't disapear under us in another concurrent transaction,
> * hence using ShareUpdateExclusiveLock. XXX this seems prone to lock
> * escalation troubles.
> */
>
> My thinking is that the user code can do anything, even run a DROP
> command against the object that we are preparing ourselves to be
> dropping. I don't think we can bypass doing the lookup twice.

Sure, but there's also the issue of getting the right answer. For
example, suppose you are planning to drop a function. You do a name
lookup on the name and call the event trigger. Meanwhile, someone
else renames the function and creates a new one with the same name.
After the event trigger returns, the actual drop code latches onto the
one with the new name. Now, the event trigger ran with OID A and the
actual object dropped was one with OID B. It's possible that KaiGai's
RENAME refactoring in 9.3 has strengthened the locking enough that
this particular failure mode is no longer possible in that specific
scenario, but I am fairly confident it would have been an issue in
9.2, and it may still be. In general, the only way to be sure that
things like that can't happen is to only do the name lookup once. Our
locking on non-table objects is still pretty flim-flammy, and there
are a lot of concurrent scenarios where things that seem like they
shouldn't change under you actually can.

There's another issue here, too, which is that this change
reintroduces a failure mode that I spent a lot of time cleaning up
during the 9.2 cycle - namely, taking locks on a table before
permissions have been checked, thus providing new methods for
unprivileged users to block operations on tables they don't have
rights to.

>> I'm very much opposed to that proposal, as I am to your proposal to
>> expose internal and generated events to users. Recursing into
>> ProcessUtility is a nasty, messy hook that is responsible for subtle
>> bugs and locking problems in our current DDL implementation. We
>
> We are publishing 2 pieces of information tied to the implementation in
> the current patch:
>
> - the parse tree, only available to C code, and without any API
> stability promises

I'm OK with that. People who program in C are taking their life in
their hands anyway as far as new versions are concerned.

> - the fact that somme commands are running nested commands: serial does
> a create sequence, create schema create table issues a create table.
>
> My understanding of your proposal not to call them ddl_command_* events
> is that you are exposing exactly the same set of implementation
> dependent information to the users, just under another name.

What I'm proposing is that we have a separate set of events that get
invoked every time an object is created, altered, or dropped,
regardless of the cause. Yes, the name will be different, but the
point is not to only change the name, but rather to also change the
definition. There are two separate things you might want here:

- to get a callback every time a command is executed (e.g. so you can
log it, check permissions, or reject the entire statement)
- to get a callback every time a certain kind of SQL object is
created, altered, or destroyed (e.g. so you can replicate those
operations on another server)

For example, suppose a user executes the command "DROP OWNED BY fred".
For a logging hook, you want the first thing: one callback, saying,
hey, the user is trying to run DROP OWNED BY fred. You log it once,
and go home. For a replication hook, you want the second thing: one
callback per object, with details about each one, so you can drop them
on the remote side. Both things are useful, but they are different.

As a further example, suppose that in 9.4 (or 9.17) we add a command
"DROP TABLES IN SCHEMA fred WHERE name LIKE 'bob%'". Well, the
logging trigger still just works (because it's only writing the
statement, without caring about its contents). And the replication
trigger still just works too (because it will still get a callback for
every table that gets dropped, even though it knows nothing about the
new syntax). That's very powerful. Of course, there will still be
cases where things have to change, but you can minimize it by clean
definitions.

It pains me that I've evidently failed to communicate this concept
clearly despite a year or more of trying. Does that make sense? Is
there some way I can make this more clear? The difference seems very
clear-cut to me, but evidently I'm not conveying it well.

> That problem is going to happen in some way or another. It's not about
> avoiding the problem completely, but choosing a set of compromises. I'm
> yet to understand what your set of compromises is in detailed enough
> practical implementation terms (not just "it's another set of events")
> and how it's a better compromise over all (I trust your judgement on
> that, I still want to understand).
>
> After all, the whole point of the Event Trigger patch is to expose some
> level of implementation details.

I don't really agree with that. I think the point is to expose what
the system is doing to the DBA. I'm OK with exposing the fact that
creating a table with a serial column also creates a sequence. There
is no problem with that - it's all good. What we DON'T want to do is
set up a system where the fact that it creates a sequence is exposed
because that happens to go through ProcessUtility() and the fact that
it creates a constraint is not because that happens not to go through
ProcessUtility(). That is not the sort of quality that our users have
come to expect from PostgreSQL.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-01-17 16:31:51 Re: Event Triggers: adding information
Previous Message Thom Brown 2013-01-17 16:03:55 Re: Materialized views WIP patch