Re: best db schema for time series data?

From: Harald Fuchs <hari(dot)fuchs(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: best db schema for time series data?
Date: 2010-11-16 16:28:19
Message-ID: 86zkt9qllo.fsf@mgm.protecting.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

In article <4CE2688B(dot)2050000(at)tweakers(dot)net>,
Arjen van der Meijden <acmmailing(at)tweakers(dot)net> writes:

> On 16-11-2010 11:50, Louis-David Mitterrand wrote:
>> I have to collect lots of prices from web sites and keep track of their
>> changes. What is the best option?
>>
>> 1) one 'price' row per price change:
>>
>> create table price (
>> id_price primary key,
>> id_product integer references product,
>> price integer
>> );
>>
>> 2) a single 'price' row containing all the changes:
>>
>> create table price (
>> id_price primary key,
>> id_product integer references product,
>> price integer[] -- prices are 'pushed' on this array as they change
>> );
>>
>> Which is bound to give the best performance, knowing I will often need
>> to access the latest and next-to-latest prices?

> If you mostly need the last few prices, I'd definitaly go with the
> first aproach, its much cleaner. Besides, you can store a date/time
> per price, so you know when it changed. With the array-approach that's
> a bit harder to do.

I'd probably use a variant of this:

CREATE TABLE prices (
pid int NOT NULL REFERENCES products,
validTil timestamp(0) NULL,
price int NOT NULL,
UNIQUE (pid, validTil)
);

The current price of a product is always the row with validTil IS NULL.
The lookup should be pretty fast because it can use the index of the
UNIQUE constraint.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Chris Browne 2010-11-16 16:35:24 Re: best db schema for time series data?
Previous Message Alvaro Herrera 2010-11-16 15:26:44 Re: autovacuum blocks the operations of other manual vacuum