Re: time series query

From: William Garrison <postgres(at)mobydisk(dot)com>
To:
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: time series query
Date: 2007-04-02 13:50:34
Message-ID: 46110A2A.5040901@mobydisk.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Would it speed things up siginficantly if you set the dtval_smaller()
function to be immutable? Volatile is the default, so it may be
redundantly evaluating things.

Jaime Silvela wrote:
> In case anyone is interested, I was able to solve this, more or less.
> Here's my new "Latest value" query:
>
> select obj_id, val_type_id, (max(row(observation_date, val))).val
> from measurements
> group by obj_id, val_type_id
>
> It was only necessary to define a new (date, numeric) type. Below is the
> code. The performance is actually slower than using a JOIN between the
> table and its GROUP-BY version. I guess for performance, I should code
> the functions in C, but at the moment, the value for me is that it
> simplifies a lot of my 12-way join queries!
>
> create type dtval as (
> dt date,
> val numeric
> );
>
> create function dtval_smaller(dtval, dtval) returns dtval as $$
> select case when $1.dt < $2.dt then $1 else $2 end
> $$ language sql;
>
> create aggregate min (
> sfunc = dtval_smaller,
> basetype = dtval,
> stype = dtval
> );
>
> create function dtval_larger(dtval, dtval) returns dtval as $$
> select case when $1.dt > $2.dt then $1 else $2 end
> $$ language sql;
>
> create aggregate max (
> sfunc = dtval_larger,
> basetype = dtval,
> stype = dtval
> );
>
>
>
> Jaime Silvela wrote:
>> The problem I'm trying to solve is pretty standard. I have a table
>> that records measurements of different types at different times.
>>
>> CREATE TABLE measurements (
>> obj_id int4,
>> val_type_id int4 references lu_val_type(val_type_id),
>> val numeric,
>> observation_date date
>> );
>>
>> I want a query as simple and fast as possible to return the latest
>> observation of each type for each object.
>> I sent a message to this list a while ago, and the suggestion I found
>> to be the best compromise of clarity and speed was:
>> a) create an index on (obj_id, val_type_id, observation_date)
>> b) the "obvious" query becomes fast thanks to the index.
>> select ms.*
>> from (
>> select obj_id, val_type_id, max(observation_date) as
>> observation_date
>> from measurements
>> group by obj_id, val_type_id
>> ) ms_last
>> join measurements ms using (obj_id, val_type_id, observation_date);
>>
>> It still bugged me a bit that this requires a JOIN, especially since
>> in a procedural language, it would have been so easy to return the
>> value associated with the max(observation_date).
>> I think I've found a pretty good alternative. This at the moment works
>> if we keep track of time with an integer, rather than a date, but it
>> would be readily extensible.
>>
>> The idea is to in fact, associate the value with the
>> max(observation_date) like so:
>> select obj_id, val_type_id, max(array[observation_date, val])
>> group by obj_id, val_type_id;
>>
>> There are two caveats:
>> a) array requires elements to be of the same type, so observation_type
>> must be kept as "time from"
>> b) a row constructor would be ideal here, but there is now max
>> function for rowtypes.
>>
>> If I did have a max() function for row types, it would be clean to do
>> this:
>> select obj_id, val_type_id, max(row(observation_date, val))
>> group by obj_id, val_type_id;
>>
>> Now, it seems that since rowtype comparison is built in, it should be
>> pretty easy to build a max() aggregate for it. Has anybody done this?
>> I'd have looked at the code for max(anyarray) but I don't know how to
>> access it. Can someone point me in the right direction?
>>
>> Also, has someone thought about this before? I'm wondering if there
>> will be a speed gain coming from this.
>>
>> Thank you,
>> Jaime
>>
>>
>> ***********************************************************************
>> Bear Stearns is not responsible for any recommendation, solicitation,
>> offer or agreement or any information about any transaction, customer
>> account or account activity contained in this communication.
>>
>> Bear Stearns does not provide tax, legal or accounting advice. You
>> should consult your own tax, legal and accounting advisors before
>> engaging in any transaction. In order for Bear Stearns to comply with
>> Internal Revenue Service Circular 230 (if applicable), you are notified
>> that any discussion of U.S. federal tax issues contained or referred to
>> herein is not intended or written to be used, and cannot be used, for
>> the purpose of: (A) avoiding penalties that may be imposed under the
>> Internal Revenue Code; nor (B) promoting, marketing or recommending to
>> another party any transaction or matter addressed herein.
>> ***********************************************************************
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 3: Have you checked our extensive FAQ?
>>
>> http://www.postgresql.org/docs/faq
>>
>
>
> ***********************************************************************
> Bear Stearns is not responsible for any recommendation, solicitation,
> offer or agreement or any information about any transaction, customer
> account or account activity contained in this communication.
>
> Bear Stearns does not provide tax, legal or accounting advice. You
> should consult your own tax, legal and accounting advisors before
> engaging in any transaction. In order for Bear Stearns to comply with
> Internal Revenue Service Circular 230 (if applicable), you are notified
> that any discussion of U.S. federal tax issues contained or referred to
> herein is not intended or written to be used, and cannot be used, for
> the purpose of: (A) avoiding penalties that may be imposed under the
> Internal Revenue Code; nor (B) promoting, marketing or recommending to
> another party any transaction or matter addressed herein.
> ***********************************************************************
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org/
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alban Hertroys 2007-04-02 14:21:08 Re: calling a stored procedure using sql query in 7.4
Previous Message Teodor Sigaev 2007-04-02 13:45:55 Re: to_tsvector in 8.2.3