Re: Question regarding modelling for time series

From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Alex Grund <st(dot)helldiver(at)googlemail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Question regarding modelling for time series
Date: 2012-09-05 05:36:59
Message-ID: CAL_0b1uFEuKq1r_PV3wHEdhgj=2_oVu+CdCMX=FqKwQKe=R5rQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Wed, Sep 5, 2012 at 12:16 AM, Alex Grund <st(dot)helldiver(at)googlemail(dot)com> wrote:
> So, I thought of a relational data base model like that:

It is worth to make like this

TABLE 'ts' (TimeSeries)
PK:id | name

TABLE 'r' (Releases)
PK:id | FK:ts_id | release_date | reporting_date | value

It is a little more redundant but easier to work with data.

> 1) "get me the time series [reporting_date, value] of unemployment as
> it is now seen", so give all reporting_date,value tuples with the most
> recent release_date.

SELECT * FROM r
WHERE
ts_id = 123 AND
release_date = (SELECT max(release_date) FROM r)
ORDER BY reporting_date;

> 2) "get me the time series [reporting_date, value] as it was
> published/known to the market", so that means, in this case, give this
> list:
> Unemployment US | 2011/01/01 | 2010/12/01 | xxx
> Unemployment US | 2011/02/01 | 2011/01/01 | xxx
> Unemployment US | 2011/03/01 | 2011/02/01 | xxx

If I understand it correct it will look like

SELECT DISTINCT ON (release_date) * FROM r
WHERE ts_id = 123
ORDER BY release_date, reporting_date DESC;

> 3) the same as (1) but with one enhancement: if the most recent
> release has a history of N month, but all releases has a history of
> N+X month, the time series from the most recent release should be
> delivered plus the older values (in terms of reporting_dates) from the
> second most recent release plus the more older values from the third
> most recent release and so on.

I am not quite understand what is meant here. Could you please provide
more explanation and some examples.

> And: If the datasets grow further, will be an RDBMS the right model
> for time series storage? Any ideas on what else I could use?

It will be. You will probably need some more optimization/partitioning
depending on the data distribution however it can be done later.

>
>
>
> Thank you very much!
>
>
> --Alex
>
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql

--
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray(dot)ru(at)gmail(dot)com Skype: gray-hemp Phone: +79160686204

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Alex Grund 2012-09-05 07:39:20 Re: Question regarding modelling for time series
Previous Message Alex Grund 2012-09-04 20:16:05 Question regarding modelling for time series