
Re: Postgresql performance question

From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Mark Jones <mlist(at)hackerjones(dot)org>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>,pgsql-general(at)postgresql(dot)org
Subject: Re: Postgresql performance question
Date: 2003-03-03 00:35:45
Message-ID: 1046651745.26763.241.camel@jester
Lists: pgsql-general, pgsql-hackers
On Sun, 2003-03-02 at 18:52, Mark Jones wrote:
> Hello
> 
> I am working on a project that acquires real-time data from an external
> device that I need to store, search through, and retrieve quickly. My
> application receives packets of data ranging in size from 300 to 5000
> bytes every 50 milliseconds, for a minimum duration of 24 hours before
> the data is purged or archived off disk. There are several fields in the
> data that I'd like to be able to search on to retrieve the data at a
> later time. Using a SQL database such as PostgreSQL or MySQL seems like
> it would make this task much easier. My questions are: is a SQL database
> such as PostgreSQL able to handle this kind of activity, saving a record
> of 5000 bytes at a rate of 20 times a second? And how well will it
> perform at searching through a database containing nearly two million
> records, about 8-9 gigabytes of data, assuming I have adequate
> computing hardware? I am trying to determine whether a SQL database
> would work well for this or whether I need to write my own custom
> database for this project. If anyone has experience doing anything
> similar with PostgreSQL, I would love to know about your findings.

That's not such a big load, depending on the complexity of the queries
being performed.  From the sounds of it, they're pretty simple.

The real question is: the data collection is real-time, but can you
afford a maintenance window (6 hours a week?) to do things like REINDEX?

Can you use a parallel loader or does it need to be loaded sequentially?

Will the data purge be spread over time or come in large lumps?


Simple queries on 2 million records happen in low double digit
milliseconds on relatively low end hardware with a decent index -- but
complex queries can take minutes on 1/10000th the dataset.

20 inserts / second shouldn't be difficult to achieve even on the lowest
end hardware -- my laptop can do that -- giving each packet its own
transaction.

But if it's 20 packets with parts being inserted into 50 tables (now
1000 inserts / second), things could become a little trickier -- but
it's still easily doable.
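As a rough illustration of the one-packet-per-transaction pattern being discussed, here is a hypothetical sketch. It uses Python's stdlib sqlite3 in memory as a stand-in for PostgreSQL (with psycopg you would connect to a real server, but the commit-per-packet shape is the same); the table, column, and index names are invented for the example:

```python
import os
import sqlite3
import time

# In-memory SQLite stands in for a PostgreSQL connection here.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE packets (
        id       INTEGER PRIMARY KEY,
        captured REAL    NOT NULL,  -- arrival timestamp
        channel  INTEGER NOT NULL,  -- an example searchable field
        payload  BLOB    NOT NULL   -- the 300-5000 byte packet body
    )
""")
# Index the fields you plan to search on later; this is what keeps
# lookups in the low-millisecond range even at millions of rows.
conn.execute("CREATE INDEX packets_channel_idx ON packets (channel)")

def store_packet(channel: int, payload: bytes) -> None:
    """One packet, one transaction -- the simple approach above."""
    with conn:  # opens a transaction, commits on success
        conn.execute(
            "INSERT INTO packets (captured, channel, payload)"
            " VALUES (?, ?, ?)",
            (time.time(), channel, payload),
        )

# Simulate one second of traffic: 20 packets of ~5000 bytes each.
for i in range(20):
    store_packet(i % 4, os.urandom(5000))

total = conn.execute("SELECT COUNT(*) FROM packets").fetchone()[0]
print(total)  # 20
```

At 20 commits per second this per-packet pattern is comfortable; if the load were split across many tables per packet, batching several packets into one transaction would be the usual next step.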

The most important question is the maintenance window. 

-- 
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc
