Re: Command Triggers

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-12 16:49:59
Message-ID: 201112121750.00031.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Monday, December 12, 2011 05:32:45 PM Robert Haas wrote:
> On Sun, Dec 11, 2011 at 1:55 PM, Dimitri Fontaine
>
> <dimitri(at)2ndquadrant(dot)fr> wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> >> Hm. I just noticed a relatively big hole in the patch: The handling of
> >> deletion of dependent objects currently is nonexistant because they
> >> don't go through ProcessUtility...
> >
> > That's the reason why we're talking about “command triggers” rather than
> > “DDL triggers”. We don't intend to fire the triggers at each DDL
> > operation happening on the server, but for each command.
> >
> > This restriction still allows us to provide a very useful feature when
> > checked against the main use cases we target here. Those are auditing,
> > and replication (the replay will also CASCADEs), and a generic enough
> > SUDO facility (because the trigger function can well be SECURITY
> > DEFINER).
>
> I haven't yet thought about your specific proposal here in enough to
> have a fully-formed opinion, but I am a little nervous that this may
> turn out to be one of those cases where the obvious API ends up
> working less well than might have been supposed.
What are your thoughts about a "not-obvious api"?

> For example,
> consider the "auditing" use case, and suppose that the user wants to
> log a message (to a table, or some other separate logging facility)
> every time someone creates an index. In order to do that, they're
> going to need to trap not only CREATE INDEX but also CREATE TABLE and
> ALTER CONSTRAINT, and in the latter two cases they're then going to
> have to then write some grotty logic to grovel through the statement
> or its node tree in order to figure out what indexes are being
> created. Also, if they want to log the name of the new index in cases
> where the user doesn't specify it, they're going to have to duplicate
> the backend logic which generates the index name, which is going to be
> a pain in the ass and a recipe for future bugs (e.g. because we might
> change the algorithm at some point, or the trigger might see a
> different view of the world than the actual command, due to
> intervening DDL).
I thought about using transformed statements before passing them to the
triggers to avoid exactly that issue. That would make stuff like CREATE TABLE
blub (LIKE), constraints and such transparent. It would mean quite some
additional effort though (because for several statements the transformation is
not really separated).

> This is something that has come up a lot in the context of sepgsql:
> it's not really enough to know what the toplevel command is in some
> cases, because whether or not the operation is allowed depends on
> exactly what it's doing, which depends on things like which schema
> gets used to create the table, and which tablespaces get used to
> create the table and its indexes, and perhaps (given Peter's pending
> work) what types are involved. You can deduce all of these things
> from the command text, but it requires duplicating nontrivial amounts
> of backend code, which is undesirable from a maintainability point of
> view. I think the same issues are likely to arise in the context of
> auditing, as in the example above, and even for replication: if, for
> example, the index name that is chosen on the master happens to exist
> on the standby with a different definition, replaying the actual
> command text might succeed or fail depending on how the command is
> phrased. Maybe it's OK for the slave to just choose a different name
> for that index, but now a subsequent DROP INDEX command that gets
> replayed across all servers might end up dropping logically unrelated
> indexes on different machines.
Well, you shouldn't grovel through the text - you do get passed the parsetree
which should make accessing that far easier. I can imagine somebody writing
some shiny layer which lets you grovel trough that with some xpath alike api.

> Now, on the other hand, the conceptual simplicity of this approach is
> very appealing, and I can certainly see people who would never
> consider writing a ProcessUtility_hook using this type of facility.
> However, I'm worried that it will be difficult to use as a foundation
> for robust, general-purpose tools. Is that enough reason to reject
> the whole approach? I'm not sure.
Thats where I am a bit unsure as well. I think the basic approach is sound but
it might need some work in several areas (transformed statements,
dependencies, ...). On the other hand: perfect is the enemy of good...

Having looked at it I don't think dependency handling is that much effort.
Havent looked at separation of the transformation phase.

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kohei KaiGai 2011-12-12 17:00:12 Re: [v9.2] Fix Leaky View Problem
Previous Message Robert Haas 2011-12-12 16:48:07 Re: patch : Allow toast tables to be moved to a different tablespace