Re: Triggers, Stored Procedures to Aggregate table ?

From: Arvind Sharma <arvind321(at)yahoo(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Triggers, Stored Procedures to Aggregate table ?
Date: 2010-07-08 19:49:52
Message-ID: 481023.73929.qm@web110116.mail.gq1.yahoo.com
Lists: pgsql-novice

The aggregated data does not need to be available immediately. A few
minutes past the hour is just fine.

But the data needs to be continuously aggregated into hourly/daily/weekly
tables, since the raw data in the minute table needs to be cleaned up (deleted)
once it is past the lookback window (say, 3 days). The idea is that the user
will always get info on the aggregated data for past hours and days (e.g. 3
days), which they can use to drill down to more granular data in the minute
table (if it is still available and not yet deleted).
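
A minimal sketch of that roll-up-then-purge cycle, written in Python with the
standard-library sqlite3 module standing in for PostgreSQL so that it runs on
its own. The table names (minute_data, hourly_agg), the column names and the
rollup_and_purge helper are all hypothetical, and the sketch assumes the
lookback window is at least one hour, so every purged row has already been
aggregated.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# SQLite stands in for PostgreSQL; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE minute_data (ts TEXT NOT NULL, value REAL NOT NULL);
    CREATE TABLE hourly_agg (
        hour_start TEXT PRIMARY KEY,
        min_value REAL, max_value REAL, avg_value REAL
    );
""")

def rollup_and_purge(conn, now, lookback_days=3):
    """Aggregate completed hours, then delete raw rows past the lookback window."""
    hour_start = now.replace(minute=0, second=0, microsecond=0)
    cutoff = now - timedelta(days=lookback_days)
    with conn:  # commit both steps together, or roll both back on error
        # Add completed hours that are not in hourly_agg yet.
        # (INSERT OR IGNORE is SQLite syntax; it skips hours already present.)
        conn.execute(
            """
            INSERT OR IGNORE INTO hourly_agg
            SELECT strftime('%Y-%m-%d %H:00:00', ts),
                   MIN(value), MAX(value), AVG(value)
            FROM minute_data
            WHERE ts < ?
            GROUP BY strftime('%Y-%m-%d %H:00:00', ts)
            """,
            (hour_start.strftime(FMT),),
        )
        # Only now is it safe to drop raw rows older than the lookback window.
        purged = conn.execute(
            "DELETE FROM minute_data WHERE ts < ?", (cutoff.strftime(FMT),)
        ).rowcount
    return purged

# Made-up sample rows: two old ones, one in a completed hour, one in the current hour.
conn.executemany(
    "INSERT INTO minute_data VALUES (?, ?)",
    [
        ("2010-07-04 10:15:00", 1.0),
        ("2010-07-04 10:45:00", 3.0),
        ("2010-07-08 18:30:00", 5.0),
        ("2010-07-08 19:10:00", 7.0),
    ],
)
conn.commit()

purged = rollup_and_purge(conn, datetime(2010, 7, 8, 19, 49, 52))
hourly = conn.execute("SELECT * FROM hourly_agg ORDER BY hour_start").fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM minute_data").fetchone()[0]
```

The old rows survive only as their hourly summary, while rows inside the
lookback window stay in the minute table for drill-down.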

I am leaning towards letting the DB do the aggregation through
triggers/stored procedures rather than in the main business logic. Hopefully
this will keep database round trips and other overhead to a minimum, since the
aggregation happens close to the source data.

Arvind

________________________________
From: Joe Conway <mail(at)joeconway(dot)com>
To: Arvind Sharma <arvind321(at)yahoo(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Sent: Thu, July 8, 2010 10:08:12 AM
Subject: Re: [NOVICE] Triggers, Stored Procedures to Aggregate table ?

On 07/08/2010 07:27 AM, Arvind Sharma wrote:
> I have a few tables which store raw data on a per-minute basis. I want to
> aggregate this data into another table to store every hour's worth of
> data. And from there on, from this hourly aggregate table, I want to
> store a day's worth of data into another aggregate table.
>
> You got the direction I am going with this.. :-).... Hourly, Daily,
> Weekly aggregated data into their respective tables.
>
> I could write some Java code to run periodically on these tables to
> transform them into Aggregate tables but that would have the overhead
> (Network, Disk I/O). I am wondering if there is any easy way to be
> able to write something at the Postgres level, where some Trigger will
> call some Stored Procedure on a particular table which will do the
> Aggregate (min, max, avg) and store that into a new table.

No matter what you do there is going to be overhead -- you just have to
decide when is the most appropriate or least intrusive time to incur
that overhead. A few questions come to mind:

1) Do you need immediate access to the most recent data, or can you
batch up data and live with, for example, always having the last
completed hour available?

2) Do you need continuous aggregation (e.g. the average for current
hour so far, the average for current day so far) or do you only want
aggregation of completed time periods (last hour's average,
yesterday's average, etc.)?

Over the years I have done something similar to what you describe in at
least four ways:

1) Aggregate on demand
2) Batch aggregate on a periodic basis -- e.g. run your aggregate query
with a cron job which truncates and rebuilds a table (i.e. a
"materialized view")
3) Write a C based trigger that does "continuous aggregation" to a
materialized table
4) Write a C based bulk loader that aggregates as it bulk loads the raw
data into a materialized table
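
As an illustration of option 2, here is a minimal sketch of a periodic
truncate-and-rebuild job. It is written in Python against the standard-library
sqlite3 module purely so that it is self-contained; SQLite stands in for
PostgreSQL, and the table names (minute_data, hourly_agg), the column names and
the rebuild_hourly helper are hypothetical. Only completed hours are
aggregated, and the script would be run from cron a few minutes past the hour.

```python
import sqlite3

# SQLite stands in for PostgreSQL; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE minute_data (ts TEXT NOT NULL, value REAL NOT NULL);
    CREATE TABLE hourly_agg (
        hour_start TEXT PRIMARY KEY,
        min_value REAL, max_value REAL, avg_value REAL
    );
""")

def rebuild_hourly(conn, now):
    """Empty hourly_agg and rebuild it from completed hours only.

    `now` is a 'YYYY-MM-DD HH:MM:SS' string; rows in the current,
    still-incomplete hour are left out of the aggregate.
    """
    current_hour = now[:13] + ":00:00"
    with conn:  # delete and insert are committed together
        conn.execute("DELETE FROM hourly_agg")
        conn.execute(
            """
            INSERT INTO hourly_agg
            SELECT strftime('%Y-%m-%d %H:00:00', ts),
                   MIN(value), MAX(value), AVG(value)
            FROM minute_data
            WHERE ts < ?
            GROUP BY strftime('%Y-%m-%d %H:00:00', ts)
            """,
            (current_hour,),
        )

# Made-up sample rows: two in the 17:00 hour, one at 18:10, one in the current hour.
conn.executemany(
    "INSERT INTO minute_data VALUES (?, ?)",
    [
        ("2010-07-08 17:05:00", 10.0),
        ("2010-07-08 17:35:00", 20.0),
        ("2010-07-08 18:10:00", 30.0),
        ("2010-07-08 19:02:00", 99.0),
    ],
)
conn.commit()

rebuild_hourly(conn, "2010-07-08 19:05:00")
hourly_rows = conn.execute(
    "SELECT * FROM hourly_agg ORDER BY hour_start"
).fetchall()
```

A full rebuild like this only works while the raw rows it reads still exist,
so if the minute table is purged after a lookback window, the job would need
to append new hours instead of truncating.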

HTH,

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & Support
