Re: Aggregate query for multiple records

From: Scott Gerhardt <scott(at)g-it(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-sql(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Aggregate query for multiple records
Date: 2004-08-28 18:08:58
Message-ID: 55390B9E-F91D-11D8-A763-000393801C60@g-it.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


> Greg Stark <gsstark(at)mit(dot)edu> writes:
>> [ nice example snipped ]
>> ... Also, you'll have to change it to use reals.
>
> That part, at least, can be worked around as of 7.4: use polymorphic
> functions. You can declare the functions and aggregate as working on
> anyelement/anyarray, and then they will automatically work on any
> datatype that has a + operator.
>
> regression=# create or replace function first_6_accum
> (anyarray,anyelement) returns anyarray
> regression-# language sql immutable as 'select case when
> array_upper($1,1)>=6 then $1 else $1||$2 end';
> CREATE FUNCTION
> regression=# create function sum_6(anyarray) returns anyelement
> immutable language sql as 'select
> $1[1]+$1[2]+$1[3]+$1[4]+$1[5]+$1[6]';
> CREATE FUNCTION
> regression=# create aggregate sum_first_6 (basetype=anyelement,
> sfunc=first_6_accum, stype=anyarray,initcond='{}',finalfunc=sum_6);
> CREATE AGGREGATE
> regression=# select sum_first_6(i) from (select i from (select 1 as i
> union select 2 union select 3 union select 4 union select 5 union
> select 6 union select 7 union select 8) as x order by i desc) as x;
> sum_first_6
> -------------
> 33
> (1 row)
>
> regression=# select sum_first_6(i) from (select i from (select 1.1 as
> i union select 2 union select 3 union select 4 union select 5 union
> select 6 union select 7.7 union select 8) as x order by i desc) as x;
> sum_first_6
> -------------
> 33.7
> (1 row)
>
> regression=#
>
> regards, tom lane
>

An alternate solution I'm thinking is to add column to hold a
"total_months" value that could be used to simplify queries and speed
queries ( i.e. first month of oil productin = 1, second = 2 etc.) That
way I can use select the first 6 months by using "where < 6", or any
month interval for that matter.

The following query, suggested by another list member (thanks Josh
Berkus), to populate the "total_months" column sort of work but doesn't
handle the year wrapping as it adds 88 when the year wraps (see output
below).

UPDATE prd_data_test SET months_prod = prd_data_test."date" -
prd2."date" + 1
FROM prd_data_test prd2
WHERE prd_data_test.wid = prd2.wid
AND prd2."date" = ( SELECT "date" FROM prd_data_test prd3
WHERE prd3.wid = prd2.wid
ORDER BY "date" LIMIT 1 );

The results are:
SEM=# select * from prd_data_test order by wid, date limit 20;
date | hours | oil | gas | water | pwid | wid | year
| month_prd | months_prod
--------+-------+-------+------+-------+------+-----------------+------
+-----------+-------------
196505 | 480 | 194.3 | 10.3 | 0 | 1 | 01/1-6-1-30w1/0 | 1965
| | 1
196506 | 600 | 279.4 | 13.1 | 0 | 1 | 01/1-6-1-30w1/0 | 1965
| | 2
196507 | 744 | 288.1 | 4.5 | 0 | 1 | 01/1-6-1-30w1/0 | 1965
| | 3
196508 | 720 | 234.6 | 9.4 | 2.9 | 1 | 01/1-6-1-30w1/0 | 1965
| | 4
196509 | 648 | 208.2 | 12.5 | 6 | 1 | 01/1-6-1-30w1/0 | 1965
| | 5
196510 | 744 | 209.8 | 15.3 | 0 | 1 | 01/1-6-1-30w1/0 | 1965
| | 6
196511 | 720 | 180.5 | 13.9 | 27.7 | 1 | 01/1-6-1-30w1/0 | 1965
| | 7
196512 | 744 | 227.4 | 22.8 | 5.2 | 1 | 01/1-6-1-30w1/0 | 1965
| | 8
196601 | 744 | 230.3 | 22.7 | 10 | 1 | 01/1-6-1-30w1/0 | 1966
| | 97
196602 | 672 | 173.2 | 16.5 | 17 | 1 | 01/1-6-1-30w1/0 | 1966
| | 98
196603 | 744 | 197.2 | 18.7 | 9.2 | 1 | 01/1-6-1-30w1/0 | 1966
| | 99
196604 | 720 | 168.1 | 14.1 | 3 | 1 | 01/1-6-1-30w1/0 | 1966
| | 100

Table description:
Table "prd_data"
Column | Type | Modifiers
--------+-----------------------+-----------
date | integer |
hours | real |
oil | real |
gas | real |
water | real |
pwid | integer |
wid | character varying(20) |
year | smallint |
Indexes: wid_index6

Actual table (prd_data), 9 million records:

date | hours | oil | gas | water | pwid | wid | year
--------+-------+-------+------+-------+------+-----------------+------
196507 | 360 | 159.4 | 11.3 | 40.9 | 413 | 01/1-1-1-31w1/0 | 1965
196508 | 744 | 280 | 20 | 27.2 | 413 | 01/1-1-1-31w1/0 | 1965
196509 | 360 | 171.1 | 11.4 | 50.4 | 413 | 01/1-1-1-31w1/0 | 1965
196510 | 744 | 202.1 | 25 | 89.8 | 413 | 01/1-1-1-31w1/0 | 1965
196512 | 744 | 201.3 | 23.8 | 71.9 | 413 | 01/1-1-1-31w1/0 | 1965
196511 | 720 | 184 | 17.6 | 78.9 | 413 | 01/1-1-1-31w1/0 | 1965
196610 | 744 | 99.8 | 15.4 | 53.7 | 413 | 01/1-1-1-31w1/0 | 1966
196612 | 744 | 86 | 12.8 | 36.1 | 413 | 01/1-1-1-31w1/0 | 1966
196611 | 720 | 86 | 12.6 | 41.7 | 413 | 01/1-1-1-31w1/0 | 1966
196601 | 744 | 191.6 | 22.6 | 50.7 | 413 | 01/1-1-1-31w1/0 | 1966
200301 | 461 | 68.8 | 0 | 186.3 | 47899 | 9G/6-1-50-24w3/0 | 2003
200310 | 740 | 446.3 | 0 | 563.1 | 47899 | 9G/6-1-50-24w3/0 | 2003
200306 | 667 | 92.1 | 0 | 968.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
200304 | 0 | 0 | 0 | 0 | 47899 | 9G/6-1-50-24w3/0 | 2003
200308 | 457 | 100.7 | 0 | 82.8 | 47899 | 9G/6-1-50-24w3/0 | 2003
200307 | 574 | 78 | 0 | 752 | 47899 | 9G/6-1-50-24w3/0 | 2003
200312 | 582 | 360.9 | 0 | 569 | 47899 | 9G/6-1-50-24w3/0 | 2003
200311 | 681 | 260.8 | 0 | 563.9 | 47899 | 9G/6-1-50-24w3/0 | 2003
200305 | 452 | 0 | 0 | 0 | 47899 | 9G/6-1-50-24w3/0 | 2003
200309 | 637 | 244.6 | 0 | 193.8 | 47899 | 9G/6-1-50-24w3/0 | 2003

Thanks,

--
Scott A. Gerhardt, P.Geo.
Gerhardt Information Technologies

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Joe Conway 2004-08-28 19:33:35 Re: [SQL] array_in: '{}}'::text[]
Previous Message Tom Lane 2004-08-28 16:19:37 Re: Aggregate query for multiple records