Re: Triggers, Stored Procedures to Aggregate table ?

From: Arvind Sharma <arvind321(at)yahoo(dot)com>
To: Michael Swierczek <mike(dot)swierczek(at)gmail(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Triggers, Stored Procedures to Aggregate table ?
Date: 2010-07-12 21:29:22
Message-ID: 740492.37748.qm@web110106.mail.gq1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Thanks Mike !

This definitely will help me getting kick start the whole process.

Your suggestion of running these PL/pgsql functions withing a Java Timer ( I am
on JDK 1.6) or Scheduled service is worth investigating as you mentioned,
otherwise the Stored Procedure will have to do some validation every time.

Thanks!

Arvind

________________________________
From: Michael Swierczek <mike(dot)swierczek(at)gmail(dot)com>
To: Arvind Sharma <arvind321(at)yahoo(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Sent: Mon, July 12, 2010 11:22:07 AM
Subject: Re: [NOVICE] Triggers, Stored Procedures to Aggregate table ?

On Wed, Jul 7, 2010 at 1:06 PM, Arvind Sharma <arvind321(at)yahoo(dot)com> wrote:
> Hi,
>
> Very new to the Triggers and Sprocs.
>
> I have few tables which stores raw data on minute basis. I want to aggregate
> this data into another table to store every hour worth of data. And from
> there on - from this hourly Aggregated table, want to store into another
> Aggregate Table for a day's worth of data.
>
> You got the direction I am going with this.. :-).... Hourly, Daily, Weekly
> aggregated data into their respective tables.
>
> I could write some Java code to run periodically on these tables to
> transform them into Aggregate tables but that would have the overhead
> (Network, Disk I/O). I am wondering if there is any easy way to be able to
> write something at the Postgres level, where some Trigger will call some
> Stored Procedure on a particular table which will do the Aggregate (min,
> max, avg) and store that into a new table.
>
>
> Any pointers or suggestions or examples would be highly appreciated !
>
>
> Thanks!
> Arvind
>

I'm not much past Postgres novice level myself, but I would do this as
a PL/pgsql function that is called by a Scheduled Task or Service (if
you are using Windows) or a cron (if you are using Unix) to call it
periodically. Then you get periodic execution of the task but the
processing work happens at the server instead of needing an external
network call to the database from another machine.

If you do it as a table trigger, then every time the trigger executes
it needs to make sure the data for the last minute, hour, day, week,
and so forth is not already in the appropriate table. I imagine
that's needless extra calculation that does not need to be run.

Here's a sample I wrote that you may be able to extend to meet your
needs. Good luck.
-Mike

create table item (
id integer not null,
d_val double precision not null,
date_added timestamptz not null,
constraint item_pk primary key (id));

create sequence item_minute_id_seq;

create table item_minute(
id integer not null,
d_max double precision,
d_min double precision,
d_avg double precision,
date_added timestamptz not null,
constraint item_minute_pk primary key (id));

create or replace function collect_aggregate_minute_data() returns integer as
$$
DECLARE
max_val double precision;
min_val double precision;
avg_val double precision;
last_minute timestamptz;
sql_string text;
BEGIN
-- get the most recent minute
last_minute := date_trunc('minute', now());
perform * from item_minute where date_added = last_minute;
if (found = true) then
raise exception 'Data was already added for this minute.';
end if;
select into max_val max (d_val) from item where date_added >
last_minute - interval'1 minute' and
date_added <= last_minute;
select into min_val min (d_val) from item where date_added >
last_minute - interval'1 minute' and
date_added <= last_minute;
select into avg_val avg (d_val) from item where date_added >
last_minute - interval'1 minute' and
date_added <= last_minute;
insert into item_minute (id, d_max, d_min, d_avg, date_added)
values (nextval('item_minute_id_seq'), max_val, min_val,
avg_val, last_minute);
return 1;
END;
$$
language 'plpgsql';

insert into item values (1, 1.0, now());
insert into item values (2, 2.0, now());
insert into item values (3, 3.0, now());

select collect_aggregate_minute_data();
-- if you run the above line again too quickly, it will give an error

select * from item_minute;

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Greg 2010-07-13 12:56:34 Extended Query Protocol Question
Previous Message Michael Swierczek 2010-07-12 18:22:07 Re: Triggers, Stored Procedures to Aggregate table ?