|From:||Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>|
|To:||Noah Misch <noah(at)leadboat(dot)com>|
|Cc:||pgsql-hackers(at)postgresql(dot)org, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>|
|Subject:||Re: Triggers on foreign tables|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Thank you for this new review, please find attached a revised patch.
Le jeudi 30 janvier 2014 14:05:08 Noah Misch a écrit :
> On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote:
> > What do you think about this approach ? Is there something I missed which
> > would make it not sustainable ?
> Seems basically reasonable. I foresee multiple advantages from having one
> tuplestore per query level as opposed to one for the entire transaction.
> You would remove the performance trap of backing up the tuplestore by
> rescanning. It permits reclaiming memory and disk space in
> AfterTriggerEndQuery() rather than at end of transaction. You could remove
> ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the
> flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or
> the next two tuples from the tuplestore. Using work_mem per
> AfterTriggerBeginQuery() instead of per transaction is no problem. What do
> you think of that design change?
I agree that this design is better, but I have some objections.
We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the
rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't
Consider for example the case of a foreign table with more than one AFTER
UPDATE triggers. Unless we store the tuples once for each trigger, we will
have to rescan the tuplestore.
To mitigate the effects of this behaviour, I added the option to perform a
reverse_seek when the looked-up tuple is nearer from the current index than
from the start.
> Probably best to create the tuplestore lazily, similar to how we initialize
> afterTriggers->event_cxt. tuplestore_begin_heap() is almost cheap enough to
> call unconditionally, but relatively few queries will use this.
The per-query tuplestore design stores one pointer per FDWTuplestore struct.
This struct is only used to keep track of the current READ and WRITE pointer
positions along with the tuplestore itself.
The actual structure will be initialized only if the query needs it.
> If you do pursue that change, make sure the code still does the right thing
> when it drops queued entries during subxact abort.
I don't really understand what should be done at that stage. Since triggers on
foreign tables are not allowed to be deferred, everything should be cleaned up
at the end of each query, right ? So, there shouldn't be any queued entries.
> > I do not have access to the standard specification, any advice regarding
> > specs compliance would be welcomed.
Thank you. I did not find anything more than what you noted. I think that even
if the bit about foreign tables being mentioned from a column list, and the
fact that a DROP TABLE should not create a new trigger execution context are
confusing, the fact that the CREATE TRIGGER definition explicitly mentions
either base tables or views makes me think that foreign tables are not
> You're effectively faking the presence of a RETURNING list so today's
> conforming FDWs will already do the right thing? Clever.
Yes, that's it. I hope I didn't introduce any side effects with regards to the
meaning of the hasReturning flag.
> Note that pgindent can't fix many unrelated whitespace changes. For
> example, if you add or remove a blank line, pgindent won't interfere. (We
> would not want it to interfere, because the use of blank lines is up to the
> code author.) You will still need to read through your diff for such
> > The attached patch checks this, and add documentation for this limitation.
> > I'm not really sure about how to phrase that correctly in the error
> > message
> > and the documentation. One can store at most INT_MAX foreign tuples, which
> > means that at most INT_MAX insert or delete or "half-updates" can occur.
> > By
> > half-updates, I mean that for update two tuples are stored whereas in
> > contrast to only one for insert and delete trigger.
> > Besides, I don't know where this disclaimer should be in the
> > documentation.
> > Any advice here ?
> I wouldn't mention that limitation.
Maybe it shouldn't be so prominent, but I still think a note somewhere
Should the use of work_mem be documented somewhere, too ?
> This is the performance trap I mentioned above; it brings potential O(n^2)
> complexity to certain AFTER trigger execution scenarios.
What scenarios do you have in mind ? I "fixed" the problem when there are
multiple triggers on a foreign table, do you have any other one ?
> Check "isNull", just in case. See similar code elsewhere in this file.
> This comment edit looks half-done.
|Next Message||Peter Geoghegan||2014-02-02 11:44:12||Re: pgsql: Include planning time in EXPLAIN ANALYZE output.|
|Previous Message||Heikki Linnakangas||2014-02-02 10:45:37||Re: GIN improvements part2: fast scan|