From: Noah Misch <noah(at)leadboat(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Subject: Re: Triggers on foreign tables
Date: 2014-02-04 04:28:45
Message-ID: 20140204042845.GB2391702@tornado.leadboat.com
Lists: pgsql-hackers

On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote:
> On Thursday, January 30, 2014 14:05:08, Noah Misch wrote:
> > On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote:
> > > What do you think about this approach? Is there something I missed which
> > > would make it not sustainable?
> >
> > Seems basically reasonable. I foresee multiple advantages from having one
> > tuplestore per query level as opposed to one for the entire transaction.
> > You would remove the performance trap of backing up the tuplestore by
> > rescanning. It permits reclaiming memory and disk space in
> > AfterTriggerEndQuery() rather than at end of transaction. You could remove
> > ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the
> > flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or
> > the next two tuples from the tuplestore. Using work_mem per
> > AfterTriggerBeginQuery() instead of per transaction is no problem. What do
> > you think of that design change?
> >
>
> I agree that this design is better, but I have some objections.
>
> We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the
> rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't
> go away.
>
> Consider for example the case of a foreign table with more than one AFTER
> UPDATE trigger. Unless we store the tuples once for each trigger, we will
> have to rescan the tuplestore.

Will we? Within a given query level, when do (non-deferred) triggers execute
in an order other than the enqueue order?
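
To make that concrete, here is a minimal sketch (my own illustration, not
the patch's code; the helper name is hypothetical, and AFTER_TRIGGER_2CTIDS
is private to trigger.c) of how replaying events in enqueue order would
consume the per-query tuplestore strictly sequentially:

#include "postgres.h"
#include "executor/tuptable.h"
#include "utils/tuplestore.h"

/*
 * Hypothetical helper: fetch the tuple(s) belonging to the next queued
 * foreign-table trigger event.  When events replay in enqueue order,
 * every fetch is a plain forward read; no seeking is ever needed.
 */
static void
fetch_fdw_event_tuples(Tuplestorestate *fdw_tuplestore, uint32 flags,
                       TupleTableSlot *slot1, TupleTableSlot *slot2)
{
    if (flags & AFTER_TRIGGER_2CTIDS)
    {
        /* an UPDATE queued the old and new tuples back to back */
        tuplestore_gettupleslot(fdw_tuplestore, true, false, slot1);
        tuplestore_gettupleslot(fdw_tuplestore, true, false, slot2);
    }
    else
    {
        /* an INSERT or DELETE queued a single tuple */
        tuplestore_gettupleslot(fdw_tuplestore, true, false, slot1);
    }
}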

> To mitigate the effects of this behaviour, I added the option to perform a
> reverse_seek when the looked-up tuple is nearer to the current index than
> to the start.

If there's still a need to seek within the tuplestore, that should get rid of
the O(n^2) effect. I'm hoping that per-query-level tuplestores will eliminate
the need to seek entirely.
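
For what it's worth, the mitigation might look roughly like this (a
hypothetical sketch with names of my own, not the patch's code; the
off-by-one bookkeeping of backward tuplestore reads is glossed over):

/*
 * Position the read pointer on tuple "target", stepping from whichever
 * end is closer: rewind and scan forward when the target is nearer the
 * start, else step backward from the current position.
 */
static void
seek_to_tuple(Tuplestorestate *ts, long *cur_index, long target,
              TupleTableSlot *slot)
{
    if (target < *cur_index && target < *cur_index - target)
    {
        tuplestore_rescan(ts);      /* nearer the start: rewind */
        *cur_index = 0;
    }
    while (*cur_index < target)
    {
        tuplestore_gettupleslot(ts, true, false, slot);   /* forward */
        (*cur_index)++;
    }
    while (*cur_index > target)
    {
        tuplestore_gettupleslot(ts, false, false, slot);  /* backward */
        (*cur_index)--;
    }
}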

> > If you do pursue that change, make sure the code still does the right thing
> > when it drops queued entries during subxact abort.
>
> I don't really understand what should be done at that stage. Since triggers on
> foreign tables are not allowed to be deferred, everything should be cleaned up
> at the end of each query, right? So, there shouldn't be any queued entries.

I suspect that's right. If you haven't looked over AfterTriggerEndSubXact(),
please do so and ensure all its actions still make sense in the context of
this new kind of trigger storage.
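
Concretely, with one tuplestore per query level, the end-of-query path can
simply release the store; a sketch, with field names that are assumptions
of mine rather than the patch's:

/*
 * In AfterTriggerEndQuery(): free the query level's tuplestore, which
 * releases both its memory and any spill files on disk.  (Assumed
 * query_stack/fdw_tuplestore naming, for illustration only.)
 */
static void
release_fdw_tuplestore(void)
{
    Tuplestorestate *ts;

    ts = afterTriggers->query_stack[afterTriggers->query_depth].fdw_tuplestore;
    if (ts != NULL)
    {
        tuplestore_end(ts);
        afterTriggers->query_stack[afterTriggers->query_depth].fdw_tuplestore = NULL;
    }
}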

> > > The attached patch checks this, and adds documentation for this
> > > limitation. I'm not really sure about how to phrase that correctly in
> > > the error message and the documentation. One can store at most INT_MAX
> > > foreign tuples, which means that at most INT_MAX inserts, deletes, or
> > > "half-updates" can occur. By half-updates, I mean that an update stores
> > > two tuples, in contrast to only one for an insert or delete trigger.
> > >
> > > Besides, I don't know where this disclaimer should go in the
> > > documentation. Any advice here?
> >
> > I wouldn't mention that limitation.
>
> Maybe it shouldn't be so prominent, but I still think a note somewhere
> couldn't hurt.

Perhaps. There's not much documentation of such implementation upper limits,
and there's no usage of "INT_MAX" outside of parts that discuss writing C
code. I'm not much of a visionary when it comes to the documentation; I try
to document new things with an amount of detail similar to old features.

> Should the use of work_mem be documented somewhere, too?

I wouldn't. Most uses of work_mem are undocumented, even relatively major
ones like count(DISTINCT ...) and CTEs. So, while I'd generally favor a patch
documenting all/most of the things that use work_mem, it would be odd to
document one new consumer apart from the others.

> > This is the performance trap I mentioned above; it brings potential O(n^2)
> > complexity to certain AFTER trigger execution scenarios.
>
> What scenarios do you have in mind? I "fixed" the problem when there are
> multiple triggers on a foreign table; do you have any other one?

I'm not aware of such a performance trap in your latest design. For what it's
worth, I don't think multiple triggers were ever a problem. In the previous
design, a problem arose if you had a scenario like this:

Query level 1: queue one million events
...
Repeat this section many times:
    Query level 2: queue one event
    Query level 3: queue one event
    Query level 3: execute events
    Query level 2: execute events  <-- had to advance through the 1M stored events
...
Query level 1: execute events
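
To put rough numbers on it: if the repeated section runs k times, each
"execute events" at query level 2 had to advance past the one million
level-1 events first, so the old design did on the order of k * 1,000,000
tuple advances (10^9 for k = 1000). With one tuplestore per query level,
each level reads only its own events, and the repeated section costs O(k)
in total.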

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
