Re: Triggers on foreign tables

From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Noah Misch <noah(at)leadboat(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Subject: Re: Triggers on foreign tables
Date: 2014-02-04 12:16:22
Message-ID: 6106851.JNBVX4ESVC@ronan.dunklau.fr
Lists: pgsql-hackers

On Monday, 3 February 2014 23:28:45, Noah Misch wrote:
> On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote:
> > On Thursday, 30 January 2014 14:05:08, Noah Misch wrote:
> > > On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote:
> > > > What do you think about this approach? Is there something I missed
> > > > which would make it not sustainable?
> > >
> > > Seems basically reasonable. I foresee multiple advantages from having
> > > one tuplestore per query level as opposed to one for the entire
> > > transaction. You would remove the performance trap of backing up the
> > > tuplestore by rescanning. It permits reclaiming memory and disk space
> > > in AfterTriggerEndQuery() rather than at end of transaction. You could
> > > remove ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just
> > > store the flags word: depending on AFTER_TRIGGER_2CTIDS, grab either
> > > the next one or the next two tuples from the tuplestore. Using
> > > work_mem per AfterTriggerBeginQuery() instead of per transaction is
> > > no problem. What do you think of that design change?
> >
> > I agree that this design is better, but I have some objections.
> >
> > We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the
> > rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch)
> > can't go away.
> >
> > Consider for example the case of a foreign table with more than one
> > AFTER UPDATE trigger. Unless we store the tuples once for each trigger,
> > we will have to rescan the tuplestore.
>
> Will we? Within a given query level, when do (non-deferred) triggers
> execute in an order other than the enqueue order?

Let me explain what I had in mind.

Looking at the code in AfterTriggerSaveEvent:

- we build a "template" AfterTriggerEvent, and store the tuple(s)
- for each after trigger whose trigger type matches, and whose WHEN
condition (if any) is satisfied, a copy of the previously built
AfterTriggerEvent is queued

Later, those events are fired in order.

This means that more than one event can be fired for one tuple.

Take this example:

CREATE TRIGGER trig_row_after1
AFTER UPDATE ON rem2
FOR EACH ROW
WHEN (NEW.f1 % 5 < 3)
EXECUTE PROCEDURE trigger_func('TRIG1');

CREATE TRIGGER trig_row_after2
AFTER UPDATE ON rem2
FOR EACH ROW
WHEN (NEW.f1 % 5 < 4)
EXECUTE PROCEDURE trigger_func('TRIG2');

UPDATE rem2 set f2 = 'something';

Assuming 5 rows with f1 as a serial, the fired AfterTriggerEvents'
ate_tupleindex values will be, in that order:

0-0-2-2-4-8-8

So, at least a backward seek is required for trig_row_after2 to be able to
retrieve a tuple that was already consumed when firing trig_row_after1.
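That firing order can be reproduced with a small standalone sketch (plain Python, not PostgreSQL code; the layout of one OLD/NEW tuple pair per updated row at even offsets is an assumption based on the description above):

```python
# Sketch of the enqueue order of AfterTriggerEvents for the example above.
# Every updated row stores an OLD/NEW tuple pair, so row i (0-based)
# starts at tuplestore offset 2*i; ate_tupleindex records that offset.
# Plain Python, not PostgreSQL code.

def fired_indexes(rows, triggers):
    events = []
    for i, f1 in enumerate(rows):
        index = 2 * i                 # tuples stored once per row
        for when in triggers:         # one queued event per matching trigger
            if when(f1):
                events.append(index)
    return events                     # events fire in enqueue order

rows = [1, 2, 3, 4, 5]                # f1 as a serial
triggers = [
    lambda f1: f1 % 5 < 3,            # trig_row_after1's WHEN clause
    lambda f1: f1 % 5 < 4,            # trig_row_after2's WHEN clause
]
print(fired_indexes(rows, triggers))  # [0, 0, 2, 2, 4, 8, 8]
```

Firing trig_row_after2's event for index 4 happens after both events for index 8 would have been reachable only by reading forward, hence the backward seek.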

On a side note, this made me realize that it is better to avoid storing a
tuple entirely if there is no enabled trigger (the f1 = 4 case above). The
attached patch does that, so the previous sequence becomes:

0-0-2-2-4-6-6

It also avoids initializing a tuplestore at all if it's not needed.
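The effect of that change can be sketched the same way (illustrative Python, not PostgreSQL code; only rows with at least one firing trigger are stored):

```python
# Sketch of the optimized enqueue order: a row's OLD/NEW tuple pair is
# stored only when at least one trigger actually fires for it, so the
# tuplestore offset does not advance for skipped rows (f1 = 4 here).
# Plain Python, not PostgreSQL code.

def fired_indexes_skipping(rows, triggers):
    events, index = [], 0
    for f1 in rows:
        n_matching = sum(1 for when in triggers if when(f1))
        if n_matching == 0:
            continue                  # nothing stored for this row
        events.extend([index] * n_matching)
        index += 2                    # OLD + NEW take two slots
    return events

rows = [1, 2, 3, 4, 5]                # f1 as a serial
triggers = [
    lambda f1: f1 % 5 < 3,            # trig_row_after1's WHEN clause
    lambda f1: f1 % 5 < 4,            # trig_row_after2's WHEN clause
]
print(fired_indexes_skipping(rows, triggers))   # [0, 0, 2, 2, 4, 6, 6]
```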

>
> > To mitigate the effects of this behaviour, I added the option to
> > perform a reverse_seek when the looked-up tuple is nearer to the
> > current index than to the start.
>
> If there's still a need to seek within the tuplestore, that should get rid
> of the O(n^2) effect. I'm hoping that per-query-level tuplestores will
> eliminate the need to seek entirely.

I think the only case where seeking is still needed is when more than one
after trigger has to be fired, since the abovementioned change removes the
need to seek in order to skip tuples.
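The reverse-seek heuristic mentioned earlier amounts to choosing the cheaper of two paths back to an already-consumed tuple. A minimal sketch, assuming a store that can advance forward, back up one tuple at a time, or rewind (illustrative Python only, not the tuplestore API):

```python
# Sketch of the seek-direction choice: when the wanted tuple is behind the
# read pointer, either step backward or rewind and scan forward, whichever
# needs fewer steps. Plain Python, not PostgreSQL code.

def seek_plan(current, target):
    """Return the steps to reach `target` from `current`:
    +1 = advance one tuple, -1 = back up one tuple, 'rewind' = restart."""
    if target >= current:
        return [+1] * (target - current)   # ordinary forward scan
    if current - target <= target:
        return [-1] * (current - target)   # reverse seek is shorter
    return ['rewind'] + [+1] * target      # restarting is cheaper

print(seek_plan(8, 6))   # [-1, -1]: back up two tuples
print(seek_plan(8, 1))   # ['rewind', 1]: rewinding is cheaper
```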

>
> > > If you do pursue that change, make sure the code still does the right
> > > thing when it drops queued entries during subxact abort.
> >
> > I don't really understand what should be done at that stage. Since
> > triggers on foreign tables are not allowed to be deferred, everything
> > should be cleaned up at the end of each query, right? So, there
> > shouldn't be any queued entries.
> I suspect that's right. If you haven't looked over
> AfterTriggerEndSubXact(), please do so and ensure all its actions still
> make sense in the context of this new kind of trigger storage.

You're right, I missed something here. When aborting a subxact, the
tuplestores for queries below the subxact query depth should be cleaned, if
any, because AfterTriggerEndQuery has not been called for the failing query.

The attached patch fixes that.
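The cleanup idea can be sketched as follows (illustrative Python; the class and method names are invented for this sketch and do not come from the PostgreSQL source):

```python
# Sketch of per-query-level tuplestore cleanup at subtransaction abort.
# Plain Python; names are illustrative, not from the PostgreSQL source.

class ForeignTriggerStores:
    def __init__(self):
        self.stores = []                  # one entry per open query level

    def begin_query(self):
        # AfterTriggerBeginQuery: open a tuplestore for this query level
        self.stores.append("tuplestore")  # stand-in for a real tuplestore

    def end_query(self):
        # AfterTriggerEndQuery: the normal cleanup path
        self.stores.pop()

    def abort_subxact(self, saved_depth):
        # AfterTriggerEndQuery never ran for the failing queries, so drop
        # every tuplestore opened since the subtransaction started.
        del self.stores[saved_depth:]

s = ForeignTriggerStores()
s.begin_query()                           # outer query
saved = len(s.stores)                     # depth recorded at subxact start
s.begin_query()                           # queries inside the subxact
s.begin_query()
s.abort_subxact(saved)                    # subxact aborts mid-query
print(len(s.stores))                      # 1: only the outer query remains
```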

>
> > > > The attached patch checks this, and adds documentation for this
> > > > limitation. I'm not really sure about how to phrase that correctly
> > > > in the error message and the documentation. One can store at most
> > > > INT_MAX foreign tuples, which means that at most INT_MAX inserts,
> > > > deletes, or "half-updates" can occur. By half-updates, I mean that
> > > > an update stores two tuples, in contrast to only one for an insert
> > > > or delete trigger.
> > > >
> > > > Besides, I don't know where this disclaimer should be in the
> > > > documentation. Any advice here?
> > >
> > > I wouldn't mention that limitation.
> >
> > Maybe it shouldn't be so prominent, but I still think a note somewhere
> > couldn't hurt.
>
> Perhaps. There's not much documentation of such implementation upper
> limits, and there's no usage of "INT_MAX" outside of parts that discuss
> writing C code. I'm not much of a visionary when it comes to the
> documentation; I try to document new things with an amount of detail
> similar to old features.

Ok, I removed the paragraph documenting the limitation.

> > Should the use of work_mem be documented somewhere, too ?
>
> I wouldn't. Most uses of work_mem are undocumented, even relatively major
> ones like count(DISTINCT ...) and CTEs. So, while I'd generally favor a
> patch documenting all/most of the things that use work_mem, it would be odd
> to document one new consumer apart from the others.

Ok.

>
> > > This is the performance trap I mentioned above; it brings potential
> > > O(n^2) complexity to certain AFTER trigger execution scenarios.
> >
> > What scenarios do you have in mind ? I "fixed" the problem when there are
> > multiple triggers on a foreign table, do you have any other one ?
>
> I'm not aware of such a performance trap in your latest design.

Good!

> Thanks,
> nm

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
foreign_trigger_v8.patch text/x-patch 58.9 KB
