Re: Command Triggers patch v18

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers patch v18
Date: 2012-03-27 13:37:18
Message-ID: 201203271537.18651.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tuesday, March 27, 2012 02:55:47 PM Robert Haas wrote:
> On Tue, Mar 27, 2012 at 4:27 AM, Dimitri Fontaine
>
> <dimitri(at)2ndquadrant(dot)fr> wrote:
> > In the first versions of the patch I did try to have a single point
> > where to integrate the feature and that didn't work out. If you want to
> > publish information such as object id, name and schema you need to have
> > specialized code spread out everywhere.
>
> [...]
>
> > That's meant to be a feature of the command trigger system, that's been
> > asked about by a lot of people. Having triggers fire for sub commands is
> > possibly The Right Thing™, or at least the most popular one.
>
> [...]
>
> > I will rework LOAD, and ALTER TABLE too, which is more work as you can
> > imagine, because now we have to insinuate the command trigger calling
> > into each branch of what ALTER TABLE is able to implement.
>
> [...]
>
> > From last year's cluster hacker meeting then several mails on this list,
> > it appears that we don't want to do it the way you propose, which indeed
> > would be simpler to implement.
>
> [...]
>
> > I think that's a feature. You skip calling the commands trigger when you
> > know you won't run the command at all.
>
> I am coming more and more strongly to the conclusion that we're going
> in the wrong direction here. It seems to me that you're spending an
> enormous amount of energy implementing something that goes by the name
> COMMAND TRIGGER when what you really want is an EVENT TRIGGER.
> Apparently, you don't want a callback every time someone types CREATE
> TABLE: you want a callback every time a table gets created.
Not necessarily. One use-case is doing something *before* something happens
like enforcing naming conventions, data types et al during development/schema
migration.

> In the
> interest of getting event triggers, you're arguing that we should
> contort the definition of the work "command" to include sub-commands,
> but not necessarily commands that turn out to be a no-op, and maybe
> things that are sort of like what commands do even though nobody
> actually ever executed a command by that name. I just don't think
> that's a good idea. We either have triggers on commands, and they
> execute once per command, period; or we have triggers on events and
> they execute every time that event happens.
I don't think thats a very helpfull definition. What on earth would you want to
do with plain passing of
CREATE SCHEMA blub CREATE TABLE foo() CREATE TABLE bar();

So some decomposition seems to be necessary. Getting the level right sure
ain't totally easy.
It might be helpful to pass in the context from which something like that
happened.

> As it turns out, two people have already put quite a bit of work into
> designing and implementing what amounts to an event trigger system for
> PostgreSQL: me, and KaiGai Kohei. It's called the ObjectAccessHook
> mechanism, and it fires every time we create or drop an object. It
> passes the type of object created or dropped, the OID of the object,
> and the column number also in the case of a column. The major
> drawback of this mechanism is that you have to write the code you want
> to execute in C, and you can't arrange for it to be executed via a DDL
> command; instead, you have to set a global variable from your shared
> library's _PG_init() function. However, I don't think that would be
> very hard to fix. We could simply replaces the
> InvokeObjectAccessHook() macro with a function that calls a list of
> triggers pulled from the catalog.
Which would basically require including pg_dump in the backend to implement
replication and logging. I don't think librarifying pg_dump is a fair burden
at all.
Also I have a *very hard* time to imagining to sensibly implement replicating
CREATE TABLE or ALTER TABLE ... ADD COLUMN with just object access hooks.
There is just not enough context. How would you discern between a ADD COLUMN
and a CREATE TABLE? They look very similar or even identical on a catalog
level.

I also have the strong feeling that all this would expose implementation
details *at least* as much as command triggers. A slight change in order of
catalog modifcation would be *way* harder to hide via the object hook than
something of a similar scale via command triggers.

> I think there are a couple of advantages of this approach over what
> you've got right now. First, there are a bunch of tested hook points
> that are already committed. They have well-defined semantics that are
> easy to understand: every time we create or drop an object, you get a
> callback with these arguments. Second, KaiGai Kohei is interested in
> adding more hook points in the future to service sepgsql. If the
> command triggers code and the sepgsql code both use the same set of
> hook points then (1) we won't clutter the code with multiple sets of
> hook points and (2) any features that get added for purposes of
> command triggers will also benefit sepgsql, and visca versa. Since
> the two of you are trying to solve very similar problems, it would be
> nice if both of you were pulling in the same direction. Third, and
> most importantly, it seems to match the semantics you want, which is a
> callback every time something *happens* rather than a callback every
> time someone *types something*.
Obviously unifying both on implementation level would be nice, but I don't
really see it happening. The ObjectAccessHook mechanism seems too low level to
me.

> Specifically, I propose the following plan:
>
> - Rename COMMAND TRIGGER to EVENT TRIGGER. Rewrite the documentation
> to say that we're going to fire triggers every time an *event*
> happens. Rewrite the code to put the firing mechanism inside
> InvokeObjectAccessHook, which will become a function rather than a
> macro.
That would mean extending the ObjectAccessHook mechanism to handle:
- altering objects
- before events (which would make the current data passing mechanism moot, you
can't look in the catalogs yet)
- passing names (otherwise simple triggers are *WAY* to hard to write)
- giving enough usefull context to know whats happening. Knowing that a new
row was created is just about pointless in itself if you don't know why/from
where.
- ...

Which would - at least as far I can see - essentially morph into the command
triggers patch.

> I realize this represents a radical design change from what you have
> right now, but what you have right now is messy and ill-defined and I
> don't think you can easily fix it. You're exposing great gobs of
> implementation details which means that, inevitably, every time anyone
> wants to refactor some code, the semantics of command triggers are
> going to change, or else the developer will have to go to great
> lengths to ensure that they don't. I don't think either of those
> things is going to make anyone very happy.
Hm. I agree that there is some work needed to make the whole handling more
consistent. But I don't think its as bad as you paint it.

I don't want to brush of your review here - you definitely do raise valid
points. I don't particularly like the current implementation very much. But I
simply fail to see a less annoying way.

Its also not that I am just defending a colleague - you might have noticed the
@2ndquadrant - I got interested in this patch quite a bit before that was on
the horizon.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2012-03-27 13:46:07 Re: Odd out of memory problem.
Previous Message Noah Misch 2012-03-27 13:11:38 Re: Cross-backend signals and administration (Was: Re: pg_terminate_backend for same-role)