Re: Large number of tables slow insert

From: "Loic Petit" <tls(dot)wydd(at)free(dot)fr>
To: peter(dot)schuller(at)infidyne(dot)com
Cc: pgsql-performance(at)postgresql(dot)org, hhall1001(at)reedyriver(dot)com, scott(at)richrelevance(dot)com, dforums(at)vieonet(dot)com, max(at)nts(dot)biz(dot)ua
Subject: Re: Large number of tables slow insert
Date: 2008-08-24 22:30:44
Message-ID: 1775c5ea0808241530o23baf46fo21bc1c9c16f39563@mail.gmail.com
Lists: pgsql-performance

Quite a lot of answers!

> Are all INSERTs made by one application? Maybe PREPARED STATEMENTS will
> help you? What about optimization at the application level?
Yes, all inserts come from a single application (through JDBC).
Optimizations (grouped transactions, prepared statements) will be done,
but this was our very first test under hard conditions, and it scared us
all :p.
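
In case it is useful to others on the list, here is a minimal sketch of
what those optimizations look like on the JDBC side (grouping rows into
one transaction and batching a prepared statement). The connection
details and the table and column names are made up for illustration,
not our actual schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

public class BatchInsertSketch {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection settings.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/sensors", "user", "password");
        conn.setAutoCommit(false); // group many inserts into one transaction

        // Hypothetical per-sensor table readings_42(ts, temperature, humidity, light).
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO readings_42 (ts, temperature, humidity, light) "
                + "VALUES (?, ?, ?, ?)");
        try {
            for (int i = 0; i < 10; i++) {
                ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
                ps.setDouble(2, 21.5);  // temperature reading
                ps.setDouble(3, 40.0);  // humidity reading
                ps.setDouble(4, 300.0); // light reading
                ps.addBatch();          // queue the row; no round trip per insert
            }
            ps.executeBatch(); // send all queued rows in one go
            conn.commit();     // one commit (one WAL flush) for the whole batch
        } finally {
            ps.close();
            conn.close();
        }
    }
}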

> Can you tell us what kind of application this is? It sounds like a
> control systems application where you will write the current values of
> the sensors with each scan of a PLC. If so, is that a good idea? Also
> is 3,000 sensors realistic? That would be a lot of sensors for one
> control system.
Our research project is about managing large-scale sensor network
deployments. 3,000 is quite a huge deployment, but it can be realistic
for a large agricultural deployment, for example.

> That would be an insert plus updates to each of your 6 indexes every
> 0.33 ms. Is that a good idea? Is there a better strategy? What are you
> measuring with the instruments e.g. is this a process plant or
> manufacturing facility? What will people do with this data?
I try to drop as many indexes as I can. Actually, I only really need the
index on the timestamp, to see the last readings for example and to
search historical data by period. The others (on the value columns) are
more for queries like "when was this sensor over 45°C", but those can
work without indexes (slow to query, but less heavy at insert time). I
get the data from different TelosB motes that gather temperature,
humidity and light.
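
To make the indexing trade-off concrete, here is a sketch of how one
per-sensor table could be created with only the timestamp index kept
(again, table and column names are illustrative, not our actual schema):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class SchemaSketch {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection settings.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/sensors", "user", "password");
        Statement st = conn.createStatement();
        try {
            // One table per mote; three measured values per reading.
            st.executeUpdate(
                "CREATE TABLE readings_42 ("
                + " ts timestamp NOT NULL,"
                + " temperature real,"
                + " humidity real,"
                + " light real)");
            // The only index we really need: last readings and search by period.
            st.executeUpdate(
                "CREATE INDEX readings_42_ts_idx ON readings_42 (ts)");
            // No indexes on the value columns: a query like "when was this
            // sensor over 45°C" will scan the table, but inserts stay cheap.
        } finally {
            st.close();
            conn.close();
        }
    }
}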

> Have you checked what you are bottlenecking on - CPU or disk? Try
> iostat/top/etc during the inserts. Also check actual disk utilization
> (iostat -x on linux/freebsd; varies on others) to see what percentage
> of time the disk/storage device is busy.
I looked at the iostat and top results; the postgres process was at 70%
CPU. Yes, I know that my test machine is not brand new, but I have to
find a good solution with it.

OK, I just ran some tests. It seems that I started hammering the tables
right after creating them, so VACUUM ANALYZE never got a chance to run.
I have better results now:

Average time to write 10 rows into each table:

ON 1000 TABLES
  Without any indexes:            ~1.5 s
  With only the timestamp index:  ~2.5 s
  With all indexes:               ~30 s

ON 3000 TABLES
  Without any indexes:            ~8 s
  With only the timestamp index:  ~45 s
  With all indexes:               ~3 min

I don't know why, but the difference with indexes is quite huge! When I
ran my vacuum, the system warned me about "max_fsm_relations" (1000). Do
you think raising it will change something (as Scott said)? I didn't
have time to run tests with VACUUM ANALYZE on the system tables; see you
tomorrow for other news...
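
For reference, the current value can be checked from JDBC as below; this
is only a sketch (connection details are placeholders), and actually
raising max_fsm_relations means editing postgresql.conf and restarting
the server:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FsmCheckSketch {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection settings.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/sensors", "user", "password");
        Statement st = conn.createStatement();
        try {
            // SHOW reports the active setting; with 3000 tables plus their
            // indexes, the default of 1000 tracked relations is far too low.
            ResultSet rs = st.executeQuery("SHOW max_fsm_relations");
            if (rs.next()) {
                System.out.println("max_fsm_relations = " + rs.getString(1));
            }
            rs.close();
        } finally {
            st.close();
            conn.close();
        }
    }
}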
