Re: Event Triggers: adding information

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Steve Singer <ssinger(at)ca(dot)afilias(dot)info>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Event Triggers: adding information
Date: 2013-01-25 15:42:24
Message-ID: CA+TgmoZz6MXQ5zX6dopc_xaHVkdwhEhgDFJeAWsRNs+N7e_ueA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 24, 2013 at 5:43 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
>> But I wonder: wouldn't it be better to just expose the raw string the
>> user typed? I mean, in all the cases we really care about, the
>> information we can *reliably* expose at this point is going to be
>> pretty nearly the same as extracting the third space-delimited word
>> out of the command text and splitting it on a period. And you can do
>> that much easily even in PL/pgsql. You could also extract a whole lot
>
> Totally Agreed. That's another reason why I want to provide users with
> the Normalized Command String, it will be then be even easier for them
> to do what you just say.

OK, but can we lay the issue of a *normalized* command string to the
side for just one minute, and talk about exposing the *raw* command
string? It seems to me that this would be (1) very easy to do, (2)
reasonable to slip into 9.3, and (3) useful to some people. Arguably,
the normalized command string would be useful to MORE people, but I
think the raw command string is not without its uses, and it's at
least one order of magnitude less work.

>> Now, in a ddl_command_end trigger, there's a lot more information that
>> you can usefully expose. In theory, if the command records "what it
>> did" somewhere, you can extract that information with as much
>> precision as it cared to record. However, I'm pretty skeptical of the
>> idea of exposing a single OID. I mean, if the "main" use case here is
>
> There's precedent in PostgreSQL: how do you get information about each
> row that were in the target from a FOR EACH STATEMENT trigger?

You don't. But, the fact that you can't is an unpleasant limitation,
not a model of emulation.

We make trade-offs about the scope of features all the time. Somebody
evidently thought it was reasonable for each-row triggers to expose
the tuple but to leave the corresponding feature for each-statement
triggers unimplemented, and as much as I'd like to have that feature,
I agree that was a reasonable trade-off. But it would be quite
something else again if someone were to propose the idea of having NEW
and OLD available for each-statement triggers, but only give the
information about the first row affected by the statement, rather than
all of them. I think that proposal would quite rightly be rejected,
and I think it's morally equivalent to what you're proposing, or what
I understood you to be proposing, at any rate.

>> array of OIDs. Really, you're going to need something quite a bit
>> more complicated than that to describe something like ALTER TABLE, but
>> even for pretty simple cases a single OID doesn't cut it.
>
> If you want to solve that problem, let's talk about pseudo relations
> that you can SELECT FROM, please. We can already expose a tuplestore in
> PL code by means of cursors and SRF, but I'm not sure that's how we want
> to expose the "statement level information" here.

I'm not sure, either, but I think that exposing things as tables is a
neat idea. I had imagined passing either JSON or arrays, but tables
would be easier to work with, at least for PL/pgsql triggers.

> The only commands that will target more than one object are:
>
> - DROP, either in the command or by means of CASCADE;
> - CREATE SCHEMA with its PROCESS_UTILITY_SUBCOMMAND usage;

CREATE TABLE can also create subsidiary objects (indexes,
constraints); and there could be more things that do this in the
future.

> The CASCADE case is something else entirely, and we need to be very
> careful about its locking behavior. Also, in the CASCADE case I can
> perfectly agree that we're not talking about a ddl_something event any
> more, and that in 9.4 we will be able to provide a sql_drop generic
> event that will now about that.

I've felt from the beginning that we really need that to be able to do
anything worthwhile with this feature, and I said so last year around
this time. Meanwhile, absent that, if we put something in here that
only handles a subset of cases, the result will be DDL replication
that randomly breaks. Bill Gates used to have a reputation for being
able to give demos of beta software where he carefully avoided doing
things that would trigger the known crash bugs, and thus make it look
like everything was working great. That's fine for a marketing
presentation, but releasing in that kind of state gets you a bad
reputation.

> Mostly agreed. Do we want to ship with only PL/pgSQL support? I've not
> been following how those PL features usually reach the other PLs…

That doesn't bother me. If the facility is useful enough that people
want it in other PLs, they can submit patches to add it. I don't
believe there's any requirement that we must support this feature in
every PL before we can think about releasing.

> Well, try and get the name of the exclusion constraint created when
> running that command, then rewrite it with the name injected in the
> command:
>
> CREATE TABLE xyz(i int, exclude (i WITH =) where (i > 10) deferrable);

Oh. Ouch.

> Another case is column DEFAULT values, but I don't think we care about
> the constraint name in that case. It's possible to inject the column and
> table level CHECK constraints names in the CREATE TABLE syntax, though.

I don't think column DEFAULTs create constraints, do they?

>> way. For example, if we add a function that tells you what object
>> name the user typed, it doesn't need to expose the fact that the parse
>> tree names that field differently in different DDL constructs, which
>> certainly increases our odds of not going insane after a release or
>> two of maintaining this code.
>
> That's called tg_objectname and tg_schemaname in my proposal. Can you
> come up with either a design proposal of an "external" stable
> representation of the parse tree, or a more complete list of information
> you want exposed?

I'm not sure I can, but I think that the design I'm proposing gives a
lot of flexibility. If we create a pg_parse_tree type with no input
function and an output function that merely returns a dummy value, and
we pass a pg_parse_tree object to PL/pgsql event triggers, then it's a
small win even if it offers no accessors at all - because somebody
could put whatever accessor functions they want in a contrib module
without waiting for 9.4.

If we do offer some accessors (which I assume we would), then it's
even better. You can get at the information exposed by those
accessors, and if you want more, you don't need to wait for a new PG
release, you can add 'em via a home-grown contrib module, again,
without waiting for 9.4.

The point is - we wouldn't expose the whole parse tree. Rather, we'd
expose a parse-tree as an opaque object, and give the user functions
that would expose specific bits of it. Those bits could grow over
time without needing any changes to PLs or existing trigger functions,
and they wouldn't change if the parse tree internals changed, and
users could extend the amount of information exposed via contrib
functions (at their own risk, of course).

>>> Extra bonus items, a table_rewrite event. Unlikely for 9.3 at best,
>>> unless someone else wants to have at it, maybe?
>>
>> I'd like to take a crack at adding support for some kind of really
>> cool event for 9.4 - I suspect this one is not what I'd pick as a
>> first target, just because it seems hard. But I'll mull it over.
>
> I'll hand you another one then: have INSERT INTO create the destination
> TABLE when it does not exists, for RAD users out there. You know, those
> schema less NoSQL guys…

Wow, you don't think small. I'm not sure how hard that would be, but
I agree it would be cool. I think the trigger might have to be
written in C to get the behavior people would likely want without an
unreasonable amount of screwing around.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-01-25 15:44:31 Re: [PATCH 1/3] Fix x + y < x overflow checks
Previous Message Andrew Dunstan 2013-01-25 15:39:45 Re: BUG #6510: A simple prompt is displayed using wrong charset