Quick Links

Re: Event Triggers reduced, v1

From:	Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Event Triggers reduced, v1
Date:	2012-07-03 09:13:11
Message-ID:	m2zk7hnr6g.fsf@2ndQuadrant.fr
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Given what I foresee, simply having another columns in there named
>> evtstags with the exact same content as evttags would be the simplest
>> and most natural implementation, really.
>
> That seems a lot less general for no particular gain.

The gain is code, docs and usage simplification. I think going general
here is going to confuse every one involved and that we should have two
targets in mind: classic use cases that we want to address easily enough
in SQL and with some PLpgSQL help, and advanced use cases that are
possible to implement in PL/C using the parse tree and the soon to come
back rewritten command string.

IOW, let's make the simple things simple and the complex one possible.

The following is quite long an email where I try to give plenty of
examples and to detail the logic I'm working with so that you can easily
stab at whichever part you're thinking is not going to fly.

>> I don't foresee more generic needs here, unless you can convince me that
>> we need both a. a full stack of arbitrarily nested commands and b. a way
>> to match and target any level of that stack.
>
> Well, let's take the example of an ALTER TABLE command. You could
> want to match on:
>
> - the type of object the user mentioned in the command (did he write
> ALTER TABLE or ALTER VIEW?)
> - the type of object actually being modified (could differ if he used

I don't think it's possible to implement that without shaking all the
system, after having a look at the following lines from gram.y:

ALTER VIEW qualified_name alter_table_cmds
{
AlterTableStmt *n = makeNode(AlterTableStmt);

So, the way to implement that need from an event trigger is to use the
parse tree, and hopefully soon enough the rewritten command string.

> ALTER TABLE on a view, or visca versa)
> - the particular ALTER TABLE subcommand in use (e.g. SET STATISTICS)

Now we can publish that, we would have some events with

tag = 'ALTER TABLE'

then some others with

toplevel = 'ALTER TABLE'
tag = 'SET STATISTICS'

The same idea would need to get implemented for serial, where the tag is
'CREATE SEQUENCE' and the toplevel tag is 'CREATE TABLE'. That allows to
easily install an event trigger that gets called every time a sequence
is created, you can then have a look at the toplevel command tag if you
need to.

CREATE EVENT TRIGGER snitch_seqs
ON command_start
WHEN tag IN ('CREATE SEQUENCE')
EXECUTE PROCEDURE snitch_seqs();

The idea is that the function snitch_seqs has a "magic" variable
TG_TOPLEVEL that can be tested and will be set to 'CREATE TABLE' when
we're dealing with a SERIAL, in that example.

If you want your event trigger to only ever deal with SERIAL, you could
install it this way:

CREATE EVENT TRIGGER my_serial_trigger
ON command_start
WHEN toplevel IN ('CREATE TABLE')
AND tag IN ('CREATE SEQUENCE')
EXECUTE PROCEDURE handle_serial_trigger();

Now let's see about getting more generic than that.

We also can get tag = 'CREATE INDEX' and toplevel = 'ALTER TABLE' when
adding a primary key for example. That's an example that can lead us to
more than 2 levels of nested tags, which I would want to avoid. The
stack here would look like:

1. ALTER TABLE
2. ADD PRIMARY KEY
3. CREATE INDEX

I think only having 1 and 3 is enough, for more details the command
string and the parse tree are available. In the main use case of
replication, you mostly just want to replicate the command string. You
might want to apply some transformation rules to it (table names, cope
with a different target schema, etc) but typically those rules are to be
run in the subscriber system, not on the provider (picture a federating
system where each provider uses the same schema, that gets mapped to a
schema per provider on the subscriber).

The other problem with the stack of tags is matching them. I don't see
that it helps writing event triggers much. In the previous example, if
you want an event trigger that fires on each ALTER TABLE, you don't know
which level of the stack to target. Either you have to target the
current tag or the toplevel tag or something in between. We could easily
have the following tag stack:

1. CREATE SCHEMA
2. ALTER TABLE
3. ADD PRIMARY KEY
4. CREATE INDEX

So now we need a way to target any entry in the stack and a way to
represent the stack in every PL language we have, and an easy way to
analyze the stack. For PLpgSQL I guess that means we want to expose this
tag stack as a TABLE, and the complexity just went off the table.

My view point is that for any practical case I can think about what we
care about is the current command being run, and given how PostgreSQL is
made today that means handling one level of sub commands. That addresses
ALTER TABLE and also DROP CASCADE.

I don't think adding-in an ALTER TABLE that never happened in the middle
of those two elements is going to make life easier for anybody involved,
quite the contrary:

1. DROP TYPE
2. DROP COLUMN

Users that need that level of detail for their processing are welcome to
code their Event Trigger in PL/C and analyze the parse tree. We can call
that advanced analysis.

> I suspect there are other examples as well. If we pick the 2-D list
> representation I suggested, or something like it, we can easily
> accommodate these kinds of filters without having to whack the catalog
> representation around any further. That seems pretty appealing.

The generic approach leads us to invent a stack of tags and (I suspect)
a DSL for tag matching where you can express at least those different
things:

- this tag is found in the stack (tag <@ stack)
- this other tag is found higher in the stack
- this other tag is found just one level higher in the stack
- this other tag is found at least 2 levels higher in the stack
- this third tag is found lower in the stack
- this third tag is found just one level lower in the stack
- and maybe some more

Those semantics are going to be needed, either in the event trigger
definition itself, or in the event trigger code. We could expose a stack
of tags and ditch the WHEN clause here, but we still have to be able to
implement the filtering in PLpgSQL for simple cases.

If we're not able to express such detailed semantics I don't think we're
servicing users by making things way more complex for them to use. The
drawback is that we will have to make choices as to which tag we expose
exactly, remembering that all the details are to be found in the parse
tree and the rewritten command string.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In response to

Re: Event Triggers reduced, v1 at 2012-07-02 21:23:02 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Kapila	2012-07-03 10:43:44	Re: xlog filename formatting functions in recovery
Previous Message	Kyotaro HORIGUCHI	2012-07-03 08:59:38	Re: pl/perl and utf-8 in sql_ascii databases