Re: Using Postgres to store high volume streams of sensor readings

From: "Ciprian Dorin Craciun" <ciprian(dot)craciun(at)gmail(dot)com>
To: "Gerhard Heift" <ml-postgresql-20081012-3518(at)gheift(dot)de>, pgsql-general(at)postgresql(dot)org
Subject: Re: Using Postgres to store high volume streams of sensor readings
Date: 2008-11-21 13:03:42
Message-ID: 8e04b5820811210503t71061242s7217f980da1688@mail.gmail.com
Lists: pgsql-general

On Fri, Nov 21, 2008 at 2:55 PM, Gerhard Heift
<ml-postgresql-20081012-3518(at)gheift(dot)de> wrote:
> On Fri, Nov 21, 2008 at 02:50:45PM +0200, Ciprian Dorin Craciun wrote:
>> Hello all!
>>
>> I would like to ask some advice about the following problem
>> (related to the Dehems project: http://www.dehems.eu/ ):
>> * there are some clients; (the clients are in fact house holds;)
>> * each device has a number of sensors (about 10), and not all the
>> clients have the same sensors; also sensors might appear and disappear
>> dynamically; (the sensors are appliances;)
>> * for each device and each sensor a reading is produced (about
>> every 6 seconds); (the values could be power consumptions;)
>> * I would like to store the following data: (client, sensor,
>> timestamp, value);
>> * the usual queries are:
>> * for a given client (and sensor), and time interval, I need
>> the min, max, and avg of the values;
>> * for a given time interval (and sensor), I need min, max, and
>> avg of the values;
>> * other statistics;
>>
>> Currently I'm benchmarking the following storage solutions for this:
>> * Hypertable (http://www.hypertable.org/) -- which has a good insert
>> rate (about 250k inserts / s), but a slow read rate (about 150k reads /
>> s); (the aggregates are computed manually, as Hypertable does not
>> support queries other than scanning (in fact min and max are easy,
>> being the first / last key in the ordered set, but avg must be done
>> by sequential scan);)
>> * BerkeleyDB -- quite OK insert rate (about 50k inserts / s), but a
>> fabulous read rate (about 2M reads / s); (the same issue with
>> aggregates;)
>> * Postgres -- which behaves quite poorly (see below)...
>> * MySQL -- next to be tested;
>
> For this case I would use RRDTool: http://oss.oetiker.ch/rrdtool/
>
> Regards,
> Gerhard

Hi Gerhard, I know about RRDTool, but it has some limitations:
* I must know the number of sensors in advance;
* I must create a file for each client (and what if I have 10
thousand clients?);
* I can keep only a limited amount of history;
* (I'm not sure about this one, but I think) I must insert
each data point by executing a command;
* and also I can not easily replicate (distribute) it;
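For reference, the schema and aggregate queries I described above look
roughly like the sketch below. It uses SQLite via Python's stdlib purely
so it runs anywhere; the table and column names are my own illustration,
and the same SQL carries over to Postgres essentially unchanged:

```python
# Sketch (not my actual benchmark code): one row per (client, sensor,
# timestamp) reading, plus the min/max/avg interval query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE readings (
        client    INTEGER NOT NULL,
        sensor    INTEGER NOT NULL,
        timestamp INTEGER NOT NULL,  -- seconds since epoch
        value     INTEGER NOT NULL,
        PRIMARY KEY (client, sensor, timestamp)
    )
""")

# One reading per (client, sensor) about every 6 seconds.
rows = [(1, 7, t, t % 100) for t in range(0, 600, 6)]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)

# For a given client, sensor, and time interval: min, max, avg.
lo, hi, avg = conn.execute("""
    SELECT MIN(value), MAX(value), AVG(value)
    FROM readings
    WHERE client = ? AND sensor = ? AND timestamp BETWEEN ? AND ?
""", (1, 7, 0, 300)).fetchone()
print(lo, hi, avg)
```

The "for a given time interval (and sensor), over all clients" variant
is the same query with the `client = ?` condition dropped.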

Or have you used RRDTool in a similar context to mine? Do you have
some benchmarks?

Ciprian.
