Re: Command Triggers patch v18

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers patch v18
Date: 2012-03-29 20:21:36
Message-ID: m2fwcrp38v.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Well, preceding and before are synonyms, so I don't see any advantage
> in that change. But I really did mean AT permissions_checking time,
> not before or after it. That is, we'd have a hook where instead of
> doing something like this:
>
> aclresult = pg_class_aclcheck(reloid, GetUserId(), ACL_SELECT);
>
> ...we'd call a superuser-supplied trigger function and let it make the

Wow. I don't think I like that at all. It's indeed powerful, but how do
you go explaining and fixing unanticipated behavior with such things in
production? It looks too much like an invitation to break a very careful
design where each facility has to rove itself to get in.

So yes, that's not at all what I was envisioning, I still think that
most event specs for event triggers are going to have to happen in
between our different internal implementation of given facility.

Maybe we should even use BEFORE in cases I had in mind and AT for cases
like the one you're picturing?

>>> CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
>>> EXECUTE PROCEDURE function_name(args);
>>
>> I would amend that syntax to allow for a WHEN <expr> much like in the
>> DML trigger case, where the expression can play with the variables
>> exposed at the time of calling the trigger.
>>
>>  create event trigger prohibit_some_ddl
>>     preceding <timing spec>
>>          when tag in ('CREATE TABLE', 'ALTER TABLE')
>>       execute procedure throw_an_error();
>
> We could do it that way, but the tag in ('CREATE TABLE', 'ALTER
> TABLE') syntax will be tricky to evaluate. I'd really prefer NOT to
> need to start up and shut down the executor to figure out whether the
> trigger has to fire. If we are going to do it that way then we need
> to carefully define what variables are available and what values they
> get. I think that most people's event-filtering needs will be simple
> enough to be served by a more declarative syntax.

We could then maybe restrict that idea so that the syntax is more like

when <trigger variable> in (<literal>[, ...])

So that we just have to store the array of strings and support only one
operation there. We might want to go as far as special casing the object
id to store an oid vector there rather than a text array, but we could
also decide not to support per-oid command triggers in the first
release and remove that from the list of accepted <trigger variable>s.

>> I also don't like the way you squeeze in the drop cascade support here
>> by re qualifying it as a timing spec, which it's clearly not.
>>
>> On subcommand_start, what is the tag set to? the main command or the
>> subcommand? if the former, where do you find the current subcommand tag?
>> I don't think subcommand_start should be a timing spec either.
>
> That's up for grabs, but I was thinking the subcommand. For example, consider:
>
> alter table t alter column a add column b int, disable trigger all;
>
> I would imagine this having a firing sequence something like this:
>
> ddl_command_start:alter_table
> ddl_command_permissions_check:alter_table
> ddl_command_name_lookup:alter_table
> alter_table_subcommand_prep:add_column
> alter_table_subcommand_prep:disable_trigger_all
> alter_table_subcommand_start:add_column
> sql_create:column
> alter_table_subcommand_end:add_column
> alter_table_subcommand_start:disable_trigger_all
> alter_table_subcommand_end:disable_trigger_all
> ddl_command_end:alter_table
>
> Lots of room for argument there, of course.

Yeah well, in my mind the alter table actions are not subcommands yet
because they don't go back to standard_ProcessUtility(). The commands in
CREATE SCHEMA do that, same for create table foo(id serial primary key)
and its sequence and index.

We could maybe add another variable called "context" in the command
trigger procedure API that would generally be NULL and would be set to
the main command tag if we're running a subcommand.

We could still support alter table commands as sub commands as far as
the event triggers are concerned, and document that you don't get a
parse tree there even when implementing the trigger in C.

> Yeah, I would not try to pass the parse tree to every event trigger,
> just the ones where it's well-defined. For drops, I would just say,
> hey, this is what we dropped. The user can log it or ereport, but
> that's about it. Still, that's clearly enough to do a lot of really
> interesting stuff, replication being the most obvious example.

Yes the use case is important enough to want to support it. Now, to
square it into the design. I think it should fire as a “normal” event
trigger, with the context set to e.g. DROP TYPE, the tag set to this
object classid (e.g. DROP FUNCTION).

It's easy enough to implement given that CreateCommandTag() already
switch (((DropStmt *) parsetree)->removeType), we just need to move that
code in another function and reuse it at the right places. It's only
missing DROP COLUMN apparently, I would have to add that.

So you would know that the trigger is firing for a DROP FUNCTION that
happens as part of a DROP TYPE, and I guess you can deduce it's
happening because of a CASCADE behavior, right?

If there's room for doubt we need to expose a boolean variable named
“behavior” that would clarify that doubt situation, maybe it's just too
late but I don't see a need for that.

>> Lastly, EVENT opens the road to thinking about transaction event
>> triggers, on begin|commit|rollback etc. Do we want to take that road?
>
> I wouldn't rule it out. If we can make it work, I think it'd make a
> lot of people happy. Not all combinations are practical, in all
> likelihood, like I don't think you can have a trigger that fires
> before you begin the transaction, because you need an error-handling
> context before you can do anything. But you might be able to have
> post-begin and pre-commit triggers.

Good. Not this release, though, of course.

> Fair point. My main motivation is really that I'd like to have a
> syntax that doesn't presume very much about what sorts of firing
> points we might have. For many purposes, it makes sense to key on

IIUC the technique we're using in the EXPLAIN options case, it's easy to
stay flexible as long as we're providing only single words here, or at
least a simple “parser unit” thing. Hence the use of underscores in the
timing spec names.

>> So we seem to have:
>>
>>  1. start
>>  2. permissions_check
>>  3. consistency_check
>>  4. acquire_lock
>>  5. execute, for lack of a better term
>>  6. end
>>
>> The main problem with that scheme is that while it works fine for simple
>> commands, complex one will have to do lookup and take locks in the
>> middle of the permission and consistency checks, and I don't think we're
>> going to be able to change that.
>
> I think that you've got that a bit mixed up: consistency checks
> typically happen after acquiring locks, not before, because until
> you've locked any consistency checks you do are so much rubbish, as
> the situation can change under you at any time. Also, ALL commands

Ah right.

> need to do lookups and take locks in the middle of permissions checks;
> any that don't are buggy. SOME commands also need to do such things
> during execution, as in the case of ALTER TABLE .. ADD FOREIGN KEY,
> which locks and checks the main table first and then eventually gets
> around to checking the other table.

The namespace is often resolved first, sometime even before permission
checks are done. See CreateConversionCommand() for an example of that,
but really that happens about everywhere IIRC.

>> One possible answer is to say that for some commands, they all are in
>> fact the same event and your event triggers will in fact get run from
>> the same place in the command implementation, but still in that order.
>
> Yes, I think there may be cases where it works out that way. Or
> possibly some events aren't defined for some command types at all, or
> a little of both. I think we can and should do whatever makes the
> behavior least surprising from a user perspective.

Good, agreed. That will make it into a nice documentation table.

>> So we can't guarantee at large that a consistency_check trigger is
>> always run before we did name lookups and acquired locks.
>
> Is this a typo, can't for can? If not, I'm confused, because it seems
> like you were saying we could and should guarantee that, and I agree
> that we can and should.

We don't share the same view on the current state of the code, I don't
think we can guarantee that even if you think we should, unfortunately.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-03-29 20:57:37 Re: query cache
Previous Message Robert Haas 2012-03-29 20:18:36 Re: query cache