Re: Writing Trigger Functions in C

From: Charles Gomes <charlesrg(at)outlook(dot)com>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Writing Trigger Functions in C
Date: 2012-12-24 15:43:04
Message-ID: BLU002-W484BC4795C044CBF2AC8D8AB3B0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

----------------------------------------
> From: charlesrg(at)outlook(dot)com
> To: cbbrowne(at)gmail(dot)com
> CC: pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Writing Trigger Functions in C
> Date: Fri, 21 Dec 2012 12:27:26 -0500
>
> ________________________________
> > Date: Fri, 21 Dec 2012 11:56:25 -0500
> > Subject: Re: [HACKERS] Writing Trigger Functions in C
> > From: cbbrowne(at)gmail(dot)com
> > To: charlesrg(at)outlook(dot)com
> > CC: pgsql-hackers(at)postgresql(dot)org
> >
> > On Fri, Dec 21, 2012 at 11:25 AM, Charles Gomes
> > <charlesrg(at)outlook(dot)com<mailto:charlesrg(at)outlook(dot)com>> wrote:
> > >
> > > Hello guys,
> > >
> > > I've been finding performance issues when using a trigger to modify
> > inserts on a partitioned table.
> > > If using the trigger the total time goes from 1 Hour to 4 hours.
> > >
> > > The trigger is pretty simple:
> > >
> > > CREATE OR REPLACE FUNCTION quotes_insert_trigger()
> > > RETURNS trigger AS $
> > > BEGIN
> > > EXECUTE 'INSERT INTO quotes_'||
> > to_char(new.received_time,'YYYY_MM_DD') ||' VALUES (($1).*)' USING NEW
> > ;
> > > RETURN NULL;
> > > END;
> > > $
> > > LANGUAGE plpgsql;
> > >
> > > I've seen that some of you guys have worked on writing triggers in C.
> > >
> > > Does anyone have had an experience writing a trigger for partitioning
> > in C ?
> >
> > I'd want to be very careful about assuming that implementing the
> > trigger function in C
> > would necessarily improve performance. It's pretty likely that it
> > wouldn't help much,
> > as a fair bit of the cost of firing a trigger have to do with figuring
> > out which function to
> > call, marshalling arguments, and calling the function, none of which would
> > magically disappear by virtue of implementing in C.
> >
> > A *major* cost that your existing implementation has is that it's re-planning
> > the queries for every single invocation. This is an old, old problem
> > from the
> > Lisp days, "EVAL considered evil"
> > <http://stackoverflow.com/questions/2571401/why-exactly-is-eval-evil>
> >
> > The EXECUTE winds up replanning queries every time the trigger fires.
> >
> > If you can instead enumerate the partitions explicitly, putting them
> > into (say) a
> > CASE clause, the planner could generate the plan once, rather than a million
> > times, which would be a HUGE savings, vastly greater than you could
> > expect from
> > recoding into C.
> >
> > The function might look more like:
> >
> > create or replace function quotes_insert_trigger () returns trigger as $$
> > declare
> > c_rt text;
> > begin
> > c_rt := to_char(new.received_time, 'YYYY_MM_DD');
> > case c_rt
> > when '2012_03_01' then
> > insert into 2012_03_01 values (NEW.*) using new;
> > when '2012_03_02' then
> > insert into 2012_03_02 values (NEW.*) using new;
> > else
> > raise exception 'Need a new partition function for %', c_rt;
> > end case;
> > end $$ language plpgsql;
> >
> > You'd periodically need to change the function to reflect the existing set of
> > partitions, but that's cheaper than creating a new partition.
> >
> > The case statement gets more expensive (in effect O(n) on the number of
> > partitions, n) as the number of partitions increases. You could split
> > the date into pieces (e.g. - years, months, days) to diminish that cost.
> >
> > But at any rate, this should be *way* faster than what you're running now,
> > and not at any heinous change in development costs (as would likely
> > be the case reimplementing using SPI).
> > --
> > When confronted by a difficult problem, solve it by reducing it to the
> > question, "How would the Lone Ranger handle this?"
>
>
> I will change and implement it this way, I was not aware of such optimization.
> Will post back after my benchmark runs.
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

I've to have add 2 weeks of data at a time, therefore I had to keep two weeks of case statements
Replaced the short trigger function to:

CREATE OR REPLACE FUNCTION quotes_insert_trigger()
RETURNS trigger AS $$
DECLARE
r_date text;
BEGIN
r_date = to_char(new.received_time, 'YYYY_MM_DD');
case r_date
    when '2012_09_10' then
        insert into quotes_2012_09_10 values (NEW.*) using new;
        return;
    when '2012_09_11' then
        insert into quotes_2012_09_11 values (NEW.*) using new;
        return;
    when '2012_09_12' then
        insert into quotes_2012_09_12 values (NEW.*) using new;
        return;
    when '2012_09_13' then
        insert into quotes_2012_09_13 values (NEW.*) using new;
        return;
    when '2012_09_14' then
        insert into quotes_2012_09_14 values (NEW.*) using new;
        return;
    when '2012_09_15' then
        insert into quotes_2012_09_15 values (NEW.*) using new;
        return;
    when '2012_09_16' then
        insert into quotes_2012_09_16 values (NEW.*) using new;
        return;
    when '2012_09_17' then
        insert into quotes_2012_09_17 values (NEW.*) using new;
        return;
    when '2012_09_18' then
        insert into quotes_2012_09_18 values (NEW.*) using new;
        return;
    when '2012_09_19' then
        insert into quotes_2012_09_19 values (NEW.*) using new;
        return;
    when '2012_09_20' then
        insert into quotes_2012_09_20 values (NEW.*) using new;
        return;
    when '2012_09_21' then
        insert into quotes_2012_09_21 values (NEW.*) using new;
        return;
    when '2012_09_22' then
        insert into quotes_2012_09_22 values (NEW.*) using new;
        return;
    when '2012_09_23' then
        insert into quotes_2012_09_23 values (NEW.*) using new;
        return;
    when '2012_09_24' then
        insert into quotes_2012_09_24 values (NEW.*) using new;
        return;
end case
RETURN NULL;
END;
$$
LANGUAGE plpgsql;

And I had no performance improvements at all.
Took the same time as with the previous EXECUTE statement;

I don't see what am I doing wrong.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-12-24 15:48:17 Re: Feature Request: pg_replication_master()
Previous Message Simon Riggs 2012-12-24 15:13:59 Re: Feature Request: pg_replication_master()