Re: Aggregate query for multiple records

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Scott Gerhardt <scott(at)g-it(dot)ca>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Aggregate query for multiple records
Date: 2004-08-27 18:59:26
Message-ID: 200408271159.26445.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Scott,

> >> I tried your query but it doesn't seem to work. The interpreter
> >> expects prd2.date to be grouped but this doesn't make sence, see
> >> below:
> >
> > Oh, yeah, darn it.
> >
> > Hmmm ... maybe you could explain the purpose of selecting just 6?
> > This query
> > is going to get a *lot* uglier if there isn't another way to
> > accomplish it.
>
> The purpose of selecting the first 6 is we need sum values for just the
> first 3, 6 and 12 months oil/water production for every oil well in the
> database to perform statistical calculations. Optionally, we also need
> the _last_ 6 months of oil and water production.
> The prd_data table has total oil for each month but some wells have 500
> months of data, we just want the first 6.

Well, here's my suggestion, to make this query both easier and faster:

1) Add a column called "month_prod", integer, to the table. This is the
"number of months in production".
2) Populate this query with a calculated difference on your "date" column
against the first appearance of that WID (I'm assuming that each increment of
"date" = 1 month)

UPDATE prd_data SET months_prod = prd_data."date" - prd2."date" + 1
FROM prd_data prd2
WHERE prd_data.wid = prd2.wid
AND prd2."date" = ( SELECT "date" FROM prd_data prd3
WHERE prd3.wid = prd2.wid
ORDER BY "date" LIMIT 1 );

(warning: the above will take a long time and floor your server. Make sure to
do VACUUM FULL ANALYZE prd_data afterwards.)
(if you can't get the above to complete, which is possible depending on your
hardware, you could do it procedurally in a programmming language)

3) Then running your aggregate becomes very easy/fast:

SELECT wid, sum(oil) as tot_oil, sum(hours) as tot_hours
FROM prd_data
WHERE months_prod < 7
GROUP BY wid
ORDER BY wid;

--
Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Bruno Wolff III 2004-08-27 19:20:32 Re: GRANT ON all tables
Previous Message Josh Berkus 2004-08-27 16:02:20 Re: from PG_DUMP to CVS