Re: Event Triggers reduced, v1

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Event Triggers reduced, v1
Date: 2012-06-29 21:34:37
Message-ID: m2ipe9kdjm.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Here's an answer to your review (thanks!), with no patch attached yet
even if I've been cleanup up most of what you reported. Incremental diff
is available for browsing here:

https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Some of the pg_dump hunks fail to apply for me; I guess this needs to
> be remerged.

Done.

> There are a few remaining references to EVTG_FIRED_BEFORE /
> EVTG_FIRED_INSTEAD_OF which should be cleaned up. I suggest writing
> the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE... On
> a related note, evttype is still mentioned in the documentation, also.
> And CreateEventTrigStmt has a char timing field that can go.

I didn't get the memo that we had made a decision here :) That said it
will be possible to change our mind and introduce that instead of syntax
if that's necessary later in the cycle, so I'll go clean up for the
first commit.

> Incidentally, why do we not support an argument list here, as with
> ordinary triggers?

Left for a follow-up patch. Do you want it already in this one?

> I think the below hunk should get removed. Or else it should be
> surrounded by #ifdef rather than commented out.

Removed.

> Typo:

Fixed.

> psql is now very confused about what the slash command is. It's
> actually implemented as \dy, but the comments say \dev in several
> places, and slashUsage() documents it as \dct.

Fixed.

> InitializeEvtTriggerCommandCache still needs work. First, it ensures
> that CacheMemoryContext is non-NULL... after it's already stored the
> value of CacheMemoryContext into the HASHCTL. Obviously, the order
> there needs to be fixed. Also, I think you need to split this into
> two functions. There should be one function that gets called just
> once at startup time to CacheRegisterSyscacheCallback(), because we
> don't want to redo that every time the cache gets blown away. Then
> there should be another function that gets called when, and only when,
> someone needs to use the cache. That should create and populate the
> hash table.

CacheMemoryContext ordering fixed, I wrongly copied from attoptcache
here even when the memory management is not really done the same. The
only place I can see where to init the Event Trigger Cache is in
InitPostgres() itself (in src/backend/utils/init/postinit.c), done now.

> I think that all event triggers should fire in exactly alphabetical
> order by name. Now that we have the concept of multiple firing
> points, there's really no reason to distinguish between any triggers
> and triggers on specific commands. Eliminating that distinction will
> eliminate a bunch of complexity while making things *more* flexible
> for the user, who will be able to get a command trigger for a specific
> command to fire either before or after an ANY command trigger he has
> also defined rather than having the system enforce policy on him.

Internally we still need to keep the distinction to be able to fire ANY
triggers on otherwise non supported commands. I agree that we should not
concern our users with that, though. Done.

> plperl_validator still makes reference to CMDTRIGGER.

In a comment, fixed.

> Let's simplify this down to an enum with just one element, since
> that's all we're going to support for starters, and remove the related
> parser support for everything but command_start:
>
> +typedef enum TrigEvent
> +{
> + E_CommandStart = 1,
> + E_SecurityCheck = 10,
> + E_ConsistencyCheck = 15,
> + E_NameLookup = 20,
> + E_CommandEnd = 51
> +} TrigEvent;

I wanted to see where the current choice would lead us, I agree that we
can remove that level of details for now, that also allows not to pick
next event integration point names already :) Done.

> I think we should similarly rip out the vestigial support for
> supplying schema name, object name, and object ID. That's not going
> to be possible at command_start anyway; we can include that stuff in
> the patch that adds a later firing point (command end, or consistency
> check, perhaps).

We got this part of the API fixed last round, so I would prefer not to
dumb it down in this first patch. We know that we want to add exactly
that specification later, don't we?

> I think standard_ProcessUtility should have a shortcut that bypasses
> setting up the event context if there are no event triggers at all in
> the entire system, so that we do no extra work unless there's some
> potential benefit.

Setting up the event context is a very lightweight operation, and
there's no way to know if the command is going to fire an event trigger
without having done exactly what the InitEventContext() is doing. Maybe
what we need to do here is rename it?

Another problem with short cutting it aggressively is what happens if a
new event trigger is created while the command is in flight. We have yet
to discuss about that (as we only support a single timing point), but
doing it the way you propose forecloses any other choice than
"repeatable read" equivalent where we might want to have some "read
commited" behaviour, that is fire the new triggers if they appear while
the command is being run.

> It seems to me that we probably need a CommandCounterIncrement() after
> firing each event trigger, unless that's happening under the covers
> somewhere and I'm missing it. A good test case would be to have two
> event triggers. Have the first one insert a row into the table and
> check that the second one can see the row there. I suggest adding
> something like this to the regression tests.

Added CommandCounterIncrement() and a new regression test. That fails
for now, I'll have to get back to that later.

> Instead of having giant switch statements match E_WhatEver tags to
> strings and visca versa, I think it would be much better to create an
> array someplace that contains all the mappings. Then you can write a
> convenience function that scans the array for a string and returns the
> E_WhatEver tag, and another convenience function that scans the array
> for an E_WhatEver tag and returns the corresponding string. Then all
> the knowledge of how those things map onto each other is in ONE place,
> which should make things a whole lot easier in terms of future code
> maintenance, not to mention minimizing the chances of bugs of
> oversight in the patch as it stands. :-)

That means that the Enum definition can not jump from a number to
another non consecutive one, or that we have a very sparse array and
some way to fill it unknown to me. As those numbers are going to end up
on disk, we can not ever change them. I though it would be better to
mimic what we do with the NodeTag definition here.

> The comment changes in type_sanity.sql seem unnecessary. I think you
> can revert that part. There's also a one-line whitespace change in
> triggers.sql which can also be removed.

Done.

> With respect to this hunk, it seems to me that the verbiage is
> inaccurate. It will forbid the execution of any command for which
> event triggers are supported, whether they happen to be DDL commands
> or not. Also, a hypothetical DDL command that can't tolerate event
> triggers wouldn't be forbidden. I'm not sure exactly what the wording
> should look like here, but we should strive to be as exact as
> possible.
>
> + <para>
> + Forbids the execution of any DDL command:
> +
> +<programlisting>
> +CREATE OR REPLACE FUNCTION abort_any_command()
> + RETURNS event_trigger
> + LANGUAGE plpgsql
> + AS $$
> +BEGIN
> + RAISE EXCEPTION 'command % is disabled', tg_tag;
> +END;
> +$$;

Yeah, will rework that text. Not this late though, needs more brainpower.

> format_type_be_without_namespace is unused in this patch; please
> remove it for now.

Done.

> get_event_trigger_oid looks like it might be the source of your
> syscache reference leak. It would be a good idea to change the coding
> pattern of this function to match, say, get_foreign_server_oid. That
> would fix the leak and be more consistent.

Fixed meanwhile, that was it, thanks.

> I'm all reviewed out; hope that's enough for now. I hope you can get
> this cleaned up some more soon, as we are starting to run out of
> CommitFest and I would really like to get this in. Of course if we
> miss the CommitFest deadline I am happy to work on it in July and
> August but the sooner we get it done the better.

So, I've begun working on this already, and I intend to spend more time
on PostgreSQL development from next week on.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-06-29 21:35:34 Re: elog/ereport noreturn decoration
Previous Message Peter Eisentraut 2012-06-29 20:35:13 elog/ereport noreturn decoration