Re: delta relations in AFTER triggers

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Amit Khandekar <amit(dot)khandekar(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: delta relations in AFTER triggers
Date: 2014-08-12 14:39:25
Message-ID: 1407854365.90730.YahooMailNeo@web122303.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Khandekar <amit(dot)khandekar(at)enterprisedb(dot)com> wrote:
> On 7 August 2014 19:49, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>> Amit Khandekar <amit(dot)khandekar(at)enterprisedb(dot)com> wrote:

>>> I tried to google some SQLs that use REFERENCING clause with triggers.
>>> It looks like in some database systems, even the WHEN clause of CREATE TRIGGER
>>> can refer to a transition table, just like how it refers to NEW and
>>> OLD row variables.
>>>
>>> For e.g. :
>>> CREATE TRIGGER notify_dept
>>>   AFTER UPDATE ON weather
>>>   REFERENCING NEW_TABLE AS N_TABLE
>>>   NEW AS N_ROW
>>>   FOR EACH ROW
>>>   WHEN ((SELECT AVG (temperature) FROM N_TABLE) > 10)
>>>   BEGIN
>>>     notify_department(N_ROW.temperature, N_ROW.city);
>>>   END
>>>
>>> Above, it is used to get an aggregate value of all the changed rows. I think
>>> we do not currently support aggregate expressions in the where clause, but with
>>> transition tables, it makes more sense to support it later if not now.
>>
>> Interesting point; I had not thought about that.  Will see if I can
>> include support for that in the patch for the next CF; failing
>> that; I will at least be careful to not paint myself into a corner
>> where it is unduly hard to do later.
> We currently do the WHEN checks while saving the AFTER trigger events,
> and also add the tuples one by one while saving the trigger events. If
> and when we support WHEN, we would need to make all of these tuples
> saved *before* the first WHEN clause execution, and that seems to
> demand more changes in the current trigger code.

In that case my inclination is to get things working with the less
invasive patch that doesn't try to support transition table usage
in WHEN clauses, and make support for that a later patch.

> ---------------
>
> Are we later going to extend this support for constraint triggers as
> well ? I think these variables would make sense even for deferred
> constraint triggers. I think we would need some more changes if we
> want to support this, because there is no query depth while executing
> deferred triggers and so the tuplestores might be inaccessible with
> the current design.

Hmm, I would also prefer to exclude that from an initial patch, but
this and the WHEN clause issue may influence a decision I've been
struggling with.  This is my first non-trivial foray into the
planner territory, and I've been struggling with how best to bridge
the gap between where the tuplestores are *captured* in the trigger
code and where they are referenced by name in a query and
incorporated into a plan for the executor.  (The execution level
itself was almost trivial; it's getting the tuplestore reference
through the parse analysis and planning phases that is painful for
me.)  At one point I create a "tuplestore registry" using a
process-local hashmap where each Tuplestorestate and its associated
name, TupleDesc, etc. would be registered, yielding a Tsrid (based
on Oid) to use through the parse analysis and planning steps, but
then I ripped it back out again in favor of just passing the
pointer to the structure which was stored in the registry; because
the registry seemed to me to introduce more risk of memory leaks,
references to freed memory, etc.  While it helped a little with
abstraction, it seemed to make things more fragile.  But I'm still
torn on this, and unsure whether such a registry is a good idea.

Any thoughts on that?

> ---------------
>
> The following (and similarly other) statements :
> trigdesc->trig_insert_new_table |=
> (TRIGGER_FOR_INSERT(tgtype) &&
> TRIGGER_USES_TRANSITION_TABLE(trigger->tgnewtable)) ? true : false;
>
> can be simplfied to :
>
> trigdesc->trig_insert_new_table |=
> (TRIGGER_FOR_INSERT(tgtype) &&
> TRIGGER_USES_TRANSITION_TABLE(trigger->tgnewtable));

Yeah, I did recognize that, but I always get squeamish about
logical operations with values which do not equal true or false.
TRIGGER_FOR_INSERT and similar macros don't necessarily return true
for something which is not false.  I should just get over that and
trust the compiler a bit more....  :-)

> ---------------
>
> AfterTriggerExecute()
> {
> .....
> /*
> * Set up the tuplestore information.
> */
> if (trigdesc->trig_delete_old_table || trigdesc->trig_update_old_table)
> LocTriggerData.tg_olddelta =
> GetCurrentTriggerDeltaTuplestore(afterTriggers->old_tuplestores);
> .....
> }
> Above, trigdesc->trig_update_old_table is true if at least one of the
> triggers of the table has transition tables. So this will cause the
> delta table to be available on all of the triggers of the table even
> if only one out of them uses transition tables. May be this would be
> solved by using LocTriggerData.tg_trigger->tg_olddelta rather than
> trigdesc->trig_update_old_table to decide whether
> LocTriggerData.tg_olddelta should be assigned.

Good point.  Will do.

> ---------------
>
> GetCurrentTriggerDeltaTuplestore() is now used for getting fdw
> tuplestore also, so it should have a more general name.

Well, "delta" *was* my attempt at a more general name.  I need to
do another pass over the naming choices to make sure I'm being
consistent, but I attempted to use these distinctions:

  transition - OLD or NEW, ROW or TABLE data for a trigger; this is
  the terminology used in the SQL standard

  oldtable/newtable - transition data for a relation representing
  what a statement affected; corresponding to the REFERENCING
  [OLD|NEW] TABLE clauses in the SQL standard

  tuplestore - for the planner and executor, since we may find
  other uses for feeding in tuplestores; I don't want to assume in
  the naming there that these are from triggers at all

  delta - a tuplestore representing changes to data; perhaps this
  is too close in concept to transition in the trigger code since
  there is no other source for delta data and the naming should
  just  use transition

Any specific suggestions?  Maybe eliminate use of "delta" and make
that GetCurrentTriggerTransitionTuplestore()?

> ---------------
>
> #define TRIGGER_USES_TRANSITION_TABLE(namepointer) \
> ((namepointer) != (char *) NULL && (*(namepointer)) != '\0')
> Since all other code sections assume trigger->tgoldtable to be
> non-NULL, we can skip the NULL check above.

I intentionally left both tests in initially because I wasn't sure
which representation would be used.  Will review whether it is time
to get off the fence on that.  ;-)

> ---------------
>
> We should add a check to make sure the user does not supply same name
> for OLD TABLE and NEW TABLE.

I already caught that and implemented a check in my development code.

> ---------------
>
> The below code comment needs to be changed.
> * Only tables are initially supported, and only for AFTER EACH STATEMENT
> * triggers, but other permutations are accepted by the parser so we can give
> * a meaningful message from C code.
> The comment implies that with ROW triggers we do not support TABLE
> transition variables. But the patch does make these variables visible
> through ROW triggers.

Oops.  Will fix.

Thanks for the review!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Claudio Freire 2014-08-12 14:40:07 Re: Proposal: Incremental Backup
Previous Message Andres Freund 2014-08-12 14:30:00 Re: Proposal: Incremental Backup