Re: Storing sensor data

From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: Alexander Staubo <alex(at)bengler(dot)no>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Storing sensor data
Date: 2009-05-28 15:06:31
Message-ID: 9bbcef730905280806s6605d3bejd8d579be45c6f017@mail.gmail.com
Lists: pgsql-performance

2009/5/28 Alexander Staubo <alex(at)bengler(dot)no>:
> On Thu, May 28, 2009 at 2:54 PM, Ivan Voras <ivoras(at)freebsd(dot)org> wrote:
>> The volume of sensor data is potentially huge, on the order of 500,000
>> updates per hour. Sensor data is a few numeric(15,5) numbers.
>
> The size of that dataset, combined with the apparent simplicity of
> your schema and the apparent requirement for mostly sequential access
> (I'm guessing about the latter two),

Your guesses are correct, except every now and then a random value
indexed on a timestamp needs to be retrieved.
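
To make that access pattern concrete, here is a rough sketch of looking up one
reading by timestamp in an append-only log of fixed-width records. The record
layout, the scaled-integer encoding of numeric(15,5), and the names
pack_record and find_by_timestamp are assumptions made up for illustration,
not anything PostgreSQL or another database does internally.

```python
import struct
from decimal import Decimal

# Hypothetical record layout: 8-byte timestamp (seconds) plus one reading
# stored as a scaled 64-bit integer (value * 10**5), so numeric(15,5)
# values stay exact (no lossy compression).
RECORD = struct.Struct("<qq")
SCALE = 10 ** 5


def pack_record(ts, value):
    """Encode one (timestamp, value) pair as a fixed-width record."""
    return RECORD.pack(ts, int(Decimal(value) * SCALE))


def find_by_timestamp(buf, ts):
    """Binary search over records that are sorted by timestamp."""
    total = len(buf) // RECORD.size
    lo, hi = 0, total
    while lo < hi:
        mid = (lo + hi) // 2
        mid_ts, _ = RECORD.unpack_from(buf, mid * RECORD.size)
        if mid_ts < ts:
            lo = mid + 1
        else:
            hi = mid
    if lo < total:
        found_ts, scaled = RECORD.unpack_from(buf, lo * RECORD.size)
        if found_ts == ts:
            return Decimal(scaled) / SCALE
    return None  # no reading stored for that exact timestamp


# Readings arrive in time order, so appending keeps the log sorted.
log = b"".join(
    pack_record(t, v)
    for t, v in [(100, "1.5"), (160, "2.25"), (220, "-3.00001")]
)
```

Sequential scans read the log front to back, and the occasional random lookup
costs O(log n) record reads without a separate index.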

> all lead me to suspect you would
> be happier with something other than a traditional relational
> database.
>
> I don't know how exact your historical data has to be. Could you get

No "lossy" compression is allowed. Exact data is needed for the whole dataset.

> If you require precise data with the ability to filter, aggregate and
> correlate over multiple dimensions, something like Hadoop -- or one of
> the Hadoop-based column database implementations, such as HBase or
> Hypertable -- might be a better option, combined with MapReduce/Pig to
> execute analysis jobs

This looks like an interesting idea to investigate. Do you have more
experience with such databases? How do they fare with the following
requirements:

* Storing large datasets (do they pack data well in the database? No
wasted space like in e.g. hash tables?)
* Retrieving specific random records based on a timestamp or record ID?
* Storing "infinite" datasets (i.e. whose size is not known in
advance - cf. e.g. hash tables)

On the other hand, we could periodically transfer data from PostgreSQL
into a simpler database (e.g. BDB) for archival purposes (at the
expense of more code). Would that be better suited?
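
As a rough idea of how much extra code the archival step would be, here is a
minimal sketch. It uses Python's dbm.dumb module purely as a stand-in for BDB,
and it assumes the rows come from some query against a hypothetical readings
table (e.g. SELECT ts, value FROM readings WHERE ts < cutoff); the names
archive_rows and lookup are made up for illustration.

```python
import dbm.dumb
import os
import struct
import tempfile
from decimal import Decimal


def archive_rows(rows, path):
    """Copy (timestamp, value) rows into a key-value file; return the count."""
    count = 0
    with dbm.dumb.open(path, "c") as db:
        for ts, value in rows:
            # Fixed-width big-endian key; the value is kept as decimal text
            # so the archived reading is exact.
            db[struct.pack(">q", ts)] = str(value).encode("ascii")
            count += 1
    return count


def lookup(path, ts):
    """Fetch one archived reading by timestamp, or None if it is absent."""
    with dbm.dumb.open(path, "r") as db:
        raw = db.get(struct.pack(">q", ts))
    return None if raw is None else Decimal(raw.decode("ascii"))


# In practice the rows would be fetched from PostgreSQL in batches; a literal
# list stands in for the query result here.
archive_path = os.path.join(tempfile.mkdtemp(), "sensor_archive")
written = archive_rows(
    [(100, Decimal("1.50000")), (160, Decimal("2.25000"))],
    archive_path,
)
```

The rows would still have to be deleted from PostgreSQL after a successful
copy, which is where most of the additional code and failure handling would
go.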
