Re: Performance problems with large telemetric datasets on 7.4.2

From: "Sven Clement" <sven(at)dsign(dot)lu>
To: "Mark Lewis" <mark(dot)lewis(at)mir3(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance problems with large telemetric datasets on 7.4.2
Date: 2007-08-03 15:57:11
Message-ID: 881035ea0708030857s621af020r4d3d7b994e2ca9a3@mail.gmail.com
Lists: pgsql-performance

Hi,

First, thank you for your answers so far. As we are working in an
environment with NDAs, I first have to check all the queries before I
may publish them here, but the structure of the DB is publishable:

2 Tables:
Table: "public.tmdata"

Column | Type | Modifiers
------------+-----------------------------+-----------------------------
timestamp | timestamp without time zone |
id | integer | default -2147483684::bigint
datapointid | integer | default 0
value | integer | default 0

Indexes:
"tmdata_idx1" btree ("timestamp")

Legend:
-------
timestamp = Timeindex of the event
id = Hostname of the system who sent the event
datapointid = ID of the Datapoint ( less than 100 )
value = The value of the event

========================================================================

Table: "public.tmdataintervalsec"

Column | Type | Modifiers
------------+-----------------------------+-----------------------------
timestamp | timestamp without time zone |
id | integer | default -2147483684::bigint
datapointid | integer | default 0
max | integer | default 0
min | integer | default 0
avg | integer | default 0
count | integer | default 0

Indexes:
"tmdataintervalsec_idx1" btree ("timestamp", id)

Legend:
-------
timestamp = Sets the period
id = Hostname of the system who sent the event
datapointid = ID of the Datapoint ( less than 100 )
max = Max value for the period
min = Min value for the period
avg = Average of all values for the period
count = Number of rows used for generation of the statistic

The data for the second table is generated by the daemon that receives
the events and writes them to the database.
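
For illustration, the per-period aggregation the daemon performs is
roughly equivalent to the following SQL (a sketch only; the real logic
lives in the daemon, and the interval boundaries here are placeholders):

```sql
INSERT INTO tmdataintervalsec
    ("timestamp", id, datapointid, max, min, avg, count)
SELECT date_trunc('second', "timestamp"),  -- one row per one-second period
       id,
       datapointid,
       max(value),
       min(value),
       avg(value)::integer,                -- avg column is integer, so cast
       count(*)                            -- rows used for this statistic
FROM tmdata
WHERE "timestamp" >= '2007-08-01 00:00:00'
  AND "timestamp" <  '2007-08-01 00:00:01'
GROUP BY date_trunc('second', "timestamp"), id, datapointid;
```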

And we also confirmed that the index is used by the queries.
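
A check of this shape confirms whether the planner chooses the index
(the timestamps below are placeholders; the real query parameters are
covered by the NDA):

```sql
-- Placeholder range query against the event table
EXPLAIN ANALYZE
SELECT "timestamp", id, datapointid, value
FROM tmdata
WHERE "timestamp" >= '2007-08-01 00:00:00'
  AND "timestamp" <  '2007-08-02 00:00:00';
```

The plan should show an Index Scan using tmdata_idx1; a Seq Scan here
would suggest the planner's row estimates are stale and the table needs
an ANALYZE.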

Regards,
Sven

P.S.: I hope the database layout is still readable when you receive it... ;)

2007/8/3, Mark Lewis <mark(dot)lewis(at)mir3(dot)com>:
>
> On Fri, 2007-08-03 at 06:52 -0700, Sven Clement wrote:
> > Hello everybody,
> >
> > as I'm new to this list, I hope that this is the right place to post
> > this and also the right format; if I'm making a mistake, I apologize
> > in advance.
> >
> > First the background of my request:
> >
> > I'm currently employed by an enterprise which has approx. 250 systems
> > distributed worldwide which are sending telemetric data to the main
> > PostgreSQL.
> > The remote systems generate about 10 events per second per system,
> > which adds up to about 2,500 events per second in total.
> > The data is stored for about a month before it is exported and finally
> > deleted from the database.
> > The PostgreSQL server runs two databases: one with little traffic
> > (about 750K per day) and the telemetric database with heavy write
> > operations around the clock (over 20 million per day).
> > We have already found that the VACUUM process takes excessively long,
> > and as a consequence the database is being vacuumed almost permanently.
> >
> > The hardware is an IBM X306m server, 3.2 GHz HT (Pentium IV), 1 GB RAM
> > and 2x 250 GB HDDs (SATA-II) with an ext3 filesystem; one of the HDDs
> > is dedicated to the database.
> > The OS is Debian 3.1 Sarge with PostgreSQL 7.4.7 (7.4.7-6sarge1) and
> > the libpq frontend library.
> >
> > Now the problem:
> >
> > The problem we are experiencing is that our queries are slowing down
> > continuously, even though we query on the indexed column (the
> > timestamp of the event); a simple SELECT with only a simple WHERE
> > clause (< or >) takes very long to complete. So the database becomes
> > unusable for production, as the data has to be retrieved very quickly
> > if we want to act on the telemetric data.
> >
> > So I am asking myself whether it would be useful to upgrade to the
> > current 8.2 version, and whether we could expect a performance
> > improvement just by upgrading.
> >
> > Thank you for your answers,
> > Sven Clement
>
> Upgrading from 7.4.x to 8.2.x will probably give you a performance
> benefit, yes. There have been numerous changes since the days of 7.4.
>
> But you didn't really give any information about why the query is
> running slow. Specifically, could you provide the query itself, some
> information about the tables/indexes/foreign keys involved, and an
> EXPLAIN ANALYZE for one of the problematic queries?
>
> Also, what kind of vacuuming regimen are you using? Just a daily cron
> maybe? Are you regularly analyzing the tables?
>
> -- Mark Lewis
>
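
(For context on the vacuuming question: 7.4 has no built-in autovacuum,
so a typical regimen is a daily cron entry like the one below; the
database name and schedule are placeholders, not our actual setup.)

```shell
# /etc/crontab entry: vacuum and analyze the telemetric DB nightly
# ("telemetric" is a placeholder database name)
30 3 * * * postgres vacuumdb --analyze --quiet telemetric
```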

--
DSIGN.LU
Sven Clement
+352 621 63 21 18
sven(at)dsign(dot)lu

www.dsign.lu
