Re: Command Triggers patch v18

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers patch v18
Date: 2012-03-29 16:17:43
Message-ID: m2k423z8ig.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>     create command trigger before COMMAND_STEP of alter table
>>          execute procedure snitch();
>
> One thought is that it might be better to say AT or ON or WHEN rather
> than BEFORE, since BEFORE END is just a little strange; and also
> because a future hook point might be something like
> "permissions-checking", and we really want it to be AT
> permissions-checking time, not BEFORE permissions-checking time.

Yeah I tried different spellings and almost sent a version using AT or
WHEN, but it appeared to me not to be specific enough: AT permission
checking time does not exist, it either happens before or after
permission checking. I played with using “preceding” rather than before,
too, maybe you would like that better.

So I would document the different steps and their ordering, then use
only BEFORE as a qualifier, and add a pseudo step which is the end of
the command.

> If I got to pick, I'd pick this syntax:
>
> CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
> EXECUTE PROCEDURE function_name(args);
>
> Then you could eventually imagine:
>
> CREATE EVENT TRIGGER prohibit_all_ddl ON ddl_command_start EXECUTE
> PROCEDURE throw_an_error();
> CREATE EVENT TRIGGER prohibit_some_ddl ON ddl_command_start
> (alter_table) EXECUTE PROCEDURE throw_an_error();
> CREATE EVENT TRIGGER replicate_table_drops ON sql_drop (table) EXECUTE
> PROCEDURE record_table_drop();
> CREATE EVENT TRIGGER allow_whitelisted_extensions ON permissions_check
> (create_extension) EXECUTE PROCEDURE make_heroku_happy();
> CREATE EVENT TRIGGER replicate_at_subcommands ON subcommand_start
> (alter_table) EXECUTE PROCEDURE record_alter_table_subcommand();

I would amend that syntax to allow for a WHEN <expr> much like in the
DML trigger case, where the expression can play with the variables
exposed at the time of calling the trigger.

create event trigger prohibit_some_ddl
preceding <timing spec>
when tag in ('CREATE TABLE', 'ALTER TABLE')
execute procedure throw_an_error();

We must also take some time to describe the timing specs carefully,
their order of operation and which information are available for
triggers in each of them. See below.

I also don't like the way you squeeze in the drop cascade support here
by re qualifying it as a timing spec, which it's clearly not.

On subcommand_start, what is the tag set to? the main command or the
subcommand? if the former, where do you find the current subcommand tag?
I don't think subcommand_start should be a timing spec either.

> Piece by piece:
>
> - I prefer EVENT to COMMAND, because the start or end or any other
> point during the execution of a command can be viewed as an event;
> however, it is pretty clear that not everything that can be viewed as
> an event upon which we might like triggers to fire can be viewed as a
> command, even if we have a somewhat well-defined notion of
> subcommands. I continue to think that there is no sense in which
> creating a sequence to back a serial column or dropping an object due
> to DROP .. CASCADE is a subcommand; it is, however, pretty clearly an
> event. Anything we want can be an event. I think if we don't get
> this right in the first version we're going to be stuck with a
> misnomer forever. :-(

I can buy naming what we're building here EVENT TRIGGERs, albeit with
some caveats: what I implemented here only caters for commands for which
we have a distinct parse tree. That's what sub commands are in effect
and why it supports create schema sub commands but not all alter table
declination, and not cascaded drops.

Also, implementing cascade operations as calling process utility with a
complete parse tree is not going to be a little project on its own. We
can also punt that and document that triggers fired on cascaded drops
are not receiving a parse tree, only the command tag and object id and
object name and schema name, or even just the command tag and object id
in order not to incur too many additional name lookups.

Lastly, EVENT opens the road to thinking about transaction event
triggers, on begin|commit|rollback etc. Do we want to take that road?

> - It is pretty clear that for triggers that are supposed to fire at
> the start or end of a command, it's useful to be able to specify which
> commands it fires for. However, there must be other types of events
> (as in my postulated sql_drop case above) where the name of the
> command is not relevant - people aren't going to find all the objects
> that got dropped as a result of a toplevel drop table command; they're
> going to want to find all the tables that got dropped regardless of
> which incantation did it. Also keep in mind that you can do things
> like use ALTER TABLE to rename a view (and there are other cases of
> that sort of confusion with domains vs. types, aggregates vs.
> functions, etc), so being able to filter based on object type seems
> clearly better in a bunch of real-world cases. And, even if you don't
> believe that specific example, a general syntax leaves us with freedom
> to maneuver if we discover other needs in the future. Writing
> alter_table etc. as tokens rather than as ALTER TABLE also has the
> further advantages of (1) greatly decreasing the parser footprint of
> this feature and (2) relieving everyone who adds commands in the
> future of the need to also change the command-trigger grammar.

Well that's a very little burden considering that you will need to add
support for firing the triggers at the right time with the right piece
of information provided, but why not. I'm also not convinced that the
complexity of implementing proper error management is worth simplifying
the parser too much.

I still think that maybe we should bite the bullet and have two separate
parsers, one dedicated for ExplainableStmt and transaction control and
such, and one for DDL operations. It seems to me the trade offs here are
not the same at all: the performance versus user friendliness ratio is
very different. Of course, that's a whole different project now and
won't happen for 9.3, nor is it a pre-requisite for this patch.
Fortunately.

> - I don't think that triggers at what Dimitri is calling "before
> execute" are a good idea at all for the first version of this feature,
> because there is a lot of stuff that might break and examining that
> should really be a separate project from getting the basic
> infrastructure in place. However, I also don't like the name. Unlike

I smell confusion, I meant for "before execute" command triggers to be
the one I've already implemented in the current patch as "BEFORE
COMMAND" triggers. That is they fire after permission checks, error
checks and name lookups and locking.

> queries, DDL commands don't have clearly separate planning and
> execution phases, so saying that something happens before execution
> isn't really very clear. As previously discussed, the hook points in

The reason for that is that I still was trying to reuse the current set
of keywords rather than already trying to use another more flexible
mechanism. That's about it.

> the latest patch are not all entirely consistent between one command
> and the next, not only in the larger ways I've already pointed out but
> also in some smaller ways. Different commands do different kinds of

I had not set a solid policy here yet, as I figured out that whatever I
would decide to implement would get thrown away at this stage of the
review.

> integrity checks and the firing points are not always in exactly the
> same place in that sequence. Of course, not all the commands do all
> the checks in the same order, so some possibly-not-trivial refactoring
> is probably required to get that consistent. Moreover, it doesn't

I wouldn't be surprised that we have some adjusting to do here, but the
current code on this light is clearly the result of copy/pastes. So it's
quite ok even if not exactly on purpose.

> seem out of reach to think that in the future we might draw a more
> clear line again between pre-execution and execution proper, to solve
> problems like the tendency of the current code to look up the same
> name more than once and pray that it gets the same OID back every
> time. So I think that when we get to the point of sticking hooks in
> at this location, we should call them something much more specific
> than before-execute hooks. I'm not sure exactly what. Maybe

I think we should describe the steps we would like to have now already
then filter out those that are out of scope for 9.2 and implement the
others. We would only document the implemented steps, I guess.

So we seem to have:

1. start
2. permissions_check
3. consistency_check
4. acquire_lock
5. execute, for lack of a better term
6. end

The main problem with that scheme is that while it works fine for simple
commands, complex one will have to do lookup and take locks in the
middle of the permission and consistency checks, and I don't think we're
going to be able to change that.

One possible answer is to say that for some commands, they all are in
fact the same event and your event triggers will in fact get run from
the same place in the command implementation, but still in that order.

So we can't guarantee at large that a consistency_check trigger is
always run before we did name lookups and acquired locks.

> something like after_ddl_name_lookup or something like that. The DDL
> name lookup might move slightly earlier or later as a result of
> refactoring, but it can't disappear and it can't cease to involve
> locking and permissions checking. The set of things that you can
> reasonably do there (additional security checks, auditing) can't
> really change much either.

Exactly. In theory at least :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-03-29 16:28:24 Re: Command Triggers patch v18
Previous Message Tom Lane 2012-03-29 16:12:10 Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)