Re: Using Postgres to store high volume streams of sensor readings

From: Michal Szymanski <dyrex(at)poczta(dot)onet(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Using Postgres to store high volume streams of sensor readings
Date: 2008-11-21 13:12:47
Message-ID: ad905c0c-d091-4dd5-99e6-daaa2dfb598b@t2g2000yqm.googlegroups.com
Lists: pgsql-general

On 21 Nov, 13:50, ciprian(dot)crac(dot)(dot)(dot)(at)gmail(dot)com ("Ciprian Dorin Craciun")
wrote:
>     Hello all!
>
>     I would like to ask some advice about the following problem
> (related to the Dehems project: http://www.dehems.eu/):
>     * there are some clients; (the clients are in fact households;)
>     * each device has a number of sensors (about 10), and not all the
> clients have the same sensors; also sensors might appear and disappear
> dynamically; (the sensors are appliances;)
>     * for each device and each sensor a reading is produced (about
> every 6 seconds); (the values could be power consumptions;)
>     * I would like to store the following data: (client, sensor,
> timestamp, value);
>     * the usual queries are:
>         * for a given client (and sensor), and time interval, I need
> the min, max, and avg of the values;
>         * for a given time interval (and sensor), I need min, max, and
> avg of the values;
>         * other statistics;
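
As a side note, the first statistics query you describe would look
roughly like this (just a sketch; I am assuming the table is
sds_benchmark_data(client, sensor, timestamp, value) with integer
timestamps, as in your benchmark below):

    SELECT min(value), max(value), avg(value)
    FROM sds_benchmark_data
    WHERE client = 42          -- example client id
      AND sensor = 3           -- example sensor id
      AND timestamp BETWEEN 1000000 AND 2000000;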

How many devices do you expect?
As I understand it, the expected insert volume is more or less:
no. of devices * no. of sensors (about 10)
every 6 seconds. Let's assume you have 100 devices: that means 1000
inserts per 6 s, i.e. about 166 inserts per second.

>     * inserts are done like this:
>         * generated 100 million readings by using the following rule:
>             * client is randomly chosen between 0 and 10 thousand;
>             * sensor is randomly chosen between 0 and 10;
>             * the timestamp is always increasing by one;
>         * the insert is done in batches of 500 thousand inserts (I've
> also tried 5, 25, 50 and 100 thousand without big impact);
>         * the batch inserts are done through COPY sds_benchmark_data
> FROM STDIN through libpq (by using UNIX (local) sockets);
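
(For others reading the thread: the COPY batch would look roughly like
this; the column list is my assumption from the description above. The
data rows are tab-separated, in the default COPY text format, and the
stream is terminated with a backslash-period line.)

    COPY sds_benchmark_data (client, sensor, timestamp, value) FROM STDIN;
    4217	3	1227273167	512
    4217	7	1227273173	48
    \.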

>     What have I observed / tried:
>     * I've tested without the primary key and the index, and the
> results were the best for inserts (600k inserts / s), but the
> readings worked extremely slowly (due to the lack of indexing);
>     * with only the index (or only the primary key) the insert rate is
> good at the start (for the first 2 million readings), but then drops
> to about 200 inserts / s;

Try periodically executing REINDEX on your index, and run ANALYZE on
your table. To be honest, this should not influence inserts, but it
will influence selects.
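
Something like this (the index and table names are placeholders):

    REINDEX INDEX sds_benchmark_data_pkey;  -- placeholder index name
    ANALYZE sds_benchmark_data;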

Michal Szymanski
http://blog.szymanskich.net
