Re: Command Triggers patch v18

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers patch v18
Date: 2012-03-27 03:18:55
Message-ID: CA+TgmoZVW1Ui+-NdOwLMfPbXLd6j0CAN8VRigVxMSpnrQkSc-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 26, 2012 at 9:11 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> [ various trivial issues ]

OK, now I got that out of my system. Now on to bigger topics.

I am extremely concerned about the way in which this patch arranges to
invoke command triggers. You've got call sites spattered all over the
place, and I think that's going to be, basically, an unfixable mess
and a breeding ground for bugs. On a first read-through:

1. BEFORE ALTER TABLE triggers are fired in AlterTable(). However, an
ALTER TABLE statement does not necessarily call AlterTable() once and
only once. The real top-level logic for AlterTable is in
ProcessUtility(), which runs transformAlterTableStmt() to generate a
list of commands and then either calls AlterTable() or recursively
invokes ProcessUtility() for each one. This means that if IF EXISTS
is used and the table does not exist, then BEFORE command triggers
won't get invoked at all. On the other hand, if multiple commands are
specified, then I think AlterTable() may get invoked either once or
more than once, depending on exactly which commands are specified; and
we might manage to fire some CREATE INDEX command triggers or whatnot
as well, again depending on exactly what that ALTER TABLE command is
doing.

2. BEFORE CREATE TABLE triggers seem to have similar issues; see
transformCreateStmt.

rhaas=# create table foo (a serial primary key);
NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for
serial column "foo.a"
NOTICE: snitch: BEFORE CREATE SEQUENCE public.foo_a_seq [<NULL>]
NOTICE: snitch: AFTER CREATE SEQUENCE public.foo_a_seq [16392]
NOTICE: snitch: BEFORE CREATE TABLE public.foo [<NULL>]
NOTICE: snitch: BEFORE CREATE INDEX public.foo_pkey [<NULL>]
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"foo_pkey" for table "foo"
NOTICE: snitch: AFTER CREATE INDEX public.foo_pkey [16398]
NOTICE: snitch: BEFORE ALTER SEQUENCE public.foo_a_seq [16392]
NOTICE: snitch: AFTER ALTER SEQUENCE public.foo_a_seq [16392]
NOTICE: snitch: AFTER CREATE TABLE public.foo [16394]
CREATE TABLE

3. RemoveRelations() and RemoveObjects() similarly take the position
that when the object does not exist, command triggers need not fire.
This seems entirely arbitrary. CREATE EXTENSION IF NOT EXISTS,
however, takes the opposite (and probably correct) position that even
if we decide not to create the extension, we should still fire command
triggers. In a similar vein, AlterFunctionOwner_oid() skips firing
the command triggers if the old and new owners happen to be the same,
but other forms of ALTER FUNCTION (e.g. ALTER FUNCTION .. COST) fire
triggers even if the old and new parameters are the same; and
AlterForeignDataWrapperOwner_internal does NOT skip firing command
triggers just because the old and new owners are the same.

4. RemoveRelations() and RemoveObjects() also take the position that a
statement like "DROP TABLE foo, bar" should fire each relevant BEFORE
command trigger twice, then drop both objects, then fire each relevant
AFTER command trigger twice. I think that's wrong. It's 100% clear
that the user executed one, and only one, command. The only real
argument for it is that if we were to only fire command triggers once,
we wouldn't be able to pass identifying information about the objects
being dropped, since the argument-passing mechanism only has room to
pass details about a single object. I think that means that the
argument-passing mechanism needs improvement, not that we should
redefine one command as two commands. Something like CREATE SCHEMA
foo CREATE VIEW bar ... has the same problem: the user only entered
one command, but we're firing command triggers as if they entered
multiple commands. This case is possibly more defensible than the
other one, but note that the two aren't consistent with each other as
regards the order of trigger firing, and I actually think that they're
both wrong, and that only the toplevel command should fire triggers.

5. It seems desirable for BEFORE command triggers to fire at a
consistent point during command execution, but currently they don't.
For example, BEFORE DROP VIEW triggers don't fire until we've verified
that q exists, is a view, and that we have permission to drop it, but
LOAD triggers fire much earlier, before we've really checked anything
at all. And ALTER TABLE is somewhere in the middle: we fire the
BEFORE trigger after checking permissions on the main table, but
before all permissions checks are done, viz:

rhaas=> alter table p add foreign key (a) references p2 (a);
NOTICE: snitch: BEFORE ALTER TABLE public.p [16418]
ERROR: permission denied for relation p2

6. Another pretty hideous aspect of the CREATE TABLE behavior is that
AFTER triggers are fired from a completely different place than BEFORE
triggers. It's not this patch's fault that our CREATE TABLE and
ALTER TABLE code is a Rube Goldberg machine, but finding some place to
jam each trigger invocation in that happens to (mostly) work as the
code exists today is not sufficiently future-proof.

I am not too sure the above list is exhaustive. I think that what's
leading you down the garden path here is the desire to pass as much
information as possible to each command trigger. But for that, we
could simply put the hooks in ProcessUtility(), adding perhaps a bit
of logic to deal with recursive invocations of ProcessUtility() to
handle subcommands, and call it good. In fact, I think that for AFTER
triggers, you could make that work anyway - every time you enter
ProcessUtility(), you push something onto a global stack, and then
individual commands annotate that data structure with details such as
the OIDs of objects being operated on, and after processing the
command we fire the after triggers from ProcessUtility(). That way,
no matter how anybody changes the code in the future, we're guaranteed
that AFTER triggers will always fire exactly once per command, and the
worst thing that happens if someone fails to make the right changes in
some other part of the code is that the AFTER trigger doesn't get all
the command details, and even that seems less likely since a lot of
the early exit paths won't have those details to supply anyway.

BEFORE triggers are a lot harder, because there you want to gather a
certain amount of information and then fire the trigger. There's more
than a little bit of tension there. The longer you delay firing the
command trigger, the more information you can reliably supply - but on
the other hand, the more chance that things will error out before we
even reach the command trigger firing site. I think this is going to
cause problems in the future as people want to supply more details to
the command trigger - people who want more details to, say,
selectively deny commands will want to fire the command trigger later
in the execution sequence, but people who want to use it for auditing
will want to fire it earlier in the sequence, to log all attempts. I
don't know how to resolve this tension, but it seems to me that we'd
better try to pick a rule and follow it consistently; and we'd better
make sure that the rule is clear, documented, and well-defined.

More nitpicking:

- Creating a command trigger on a nonexistent function fails with
"cache lookup failed for function 0".
- tablespace.c has useless leftover changes.
- One of the examples in the docs says "RETURNS command trigger"
instead of "RETURNS command_trigger".
- The "DROP COMMAND TRIGGER" documentation contains leftovers that
refer to a parameter called "command" which no longer exists.
- The example in the "DROP COMMAND TRIGGER" documentation fails to
execute, as the syntax is no longer recognized.
- check_cmdtrigger_name is mostly duplicative of get_cmdtrigger_oid.
I think it can be rewritten as if (OidIsValid(get_cmdtrigger_oid(...))
ereport(...).
- I believe that all the changes to dropdb() can be reverted.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2012-03-27 04:03:39 Re: 9.2 commitfest closure (was Command Triggers, v16)
Previous Message limaozeng 2012-03-27 02:58:39 Can param evaluation use same estate as executor?