Sudden drop in DBb performance

From: Gerhard Wohlgenannt <wohlg(at)ai(dot)wu(dot)ac(dot)at>
To: pgsql-performance(at)postgresql(dot)org
Cc: Heinz-Peter Lang <heinz(at)langatium(dot)net>, Gerhard Wohlgenannt <wohlg(at)ai(dot)wu-wien(dot)ac(dot)at>, "Weichselbraun, Albert" <Albert(dot)Weichselbraun(at)wu(dot)ac(dot)at>
Subject: Sudden drop in DBb performance
Date: 2011-09-03 07:26:44
Message-ID: 4E61D6B4.9090804@ai.wu.ac.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Dear list,

we are encountering serious performance problems with our database.
Queries which took around 100ms or less last week now take several seconds.

The database runs on Ubuntu Server 10.4.3 (kernel: 2.6.32-33) on
hardware as follows:
8-core Intel Xeon CPU with 2.83GHz
48 GB RAM
RAID 5 with 8 SAS disks
PostgreSQL 8.4.8 (installed from the Ubuntu repository).

Additionally to the DB the machine also hosts a few virtual machines. In
the past everything worked very well and the described problem occurs
just out of the blue. We don't know of any postgresql config changes or
anything else which might explain the performance reduction.
We have a number of DBs running in the cluster, and the problem seems to
affect all of them.

We checked the performance of the RAID .. which is reasonable for eg.
"hdparm -tT". Memory is well used, but not swapping.
vmstat shows, that the machine isn't using the swap and the load
shouldn't be also to high:
root(at)host:~# vmstat
procs -----------memory---------- ---swap-- -----io---- -system--
----cpu----
r b swpd free buff cache si so bi bo in cs us
sy id wa
0 0 0 308024 884812 40512932 0 0 464 168 353 92
4 2 84 9

Bonnie++ results given below, I am no expert at interpreting those :-)

Activating log_min_duration shows for instance this query --- there are
now constantly queries which take absurdely long.

2011-09-02 22:38:18 CEST LOG: Dauer: 25520.374 ms Anweisung: SELECT
keyword_id FROM keywords.table_x WHERE keyword=E'diplomaten'

db=# explain analyze SELECT keyword_id FROM keywords.table_x WHERE
keyword=E'diplomaten';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_table_x_keyword on table_x (cost=0.00..8.29
rows=1 width=4) (actual time=0.039..0.041 rows=1 loops=1)
Index Cond: ((keyword)::text = 'diplomaten'::text)
Total runtime: 0.087 ms
(3 Zeilen)

db=# \d keywords.table_x
Tabelle »keywords.table_x«
Spalte | Typ
| Attribute
------------+-------------------+------------------------------------------------------------------------------------------------------
keyword_id | integer | not null Vorgabewert
nextval('keywords.table_x_keyword_id_seq'::regclass)
keyword | character varying |
so | double precision |
Indexe:
"table_x_pkey" PRIMARY KEY, btree (keyword_id) CLUSTER
"idx_table_x_keyword" btree (keyword)
Fremdschlüsselverweise von:
TABLE "keywords.table_x_has" CONSTRAINT
"table_x_has_keyword_id_fkey" FOREIGN KEY (keyword_id) REFERENCES
keywords.table_x(keyword_id) ON UPDATE CASCADE ON DELETE CASCADE

Could you be so kind and give us any advice how to track down the
problem or comment on possible reasons???

Thank you very much in advance!!!

Regards,
heinz + gerhard

name
| current_setting
----------------------------+-------------------------------------------------------------------------------------------------------------
version | PostgreSQL 8.4.8 on x86_64-pc-linux-gnu,
compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
archive_command | /usr/local/sbin/weblyzard-wal-archiver.sh
%p %f
archive_mode | on
checkpoint_segments | 192
effective_cache_size | 25000MB
external_pid_file | /var/run/postgresql/8.4-main.pid
full_page_writes | on
geqo | on
lc_collate | de_AT.UTF-8
lc_ctype | de_AT.UTF-8
listen_addresses | *
log_line_prefix | %t
log_min_duration_statement | 3s
maintenance_work_mem | 500MB
max_connections | 250
max_stack_depth | 2MB
port | 5432
server_encoding | UTF8
shared_buffers | 7000MB
ssl | on
TimeZone | localtime
unix_socket_directory | /var/run/postgresql
work_mem | 256MB

Results of Bonnie++

Version 1.96 ------Sequential Output------ --Sequential Input-
--Random-
Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block--
--Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP
/sec %CP
voyager 95G 1400 93 27804 3 16324 2 2925 96 41636 3
374.9 4
Latency 7576us 233s 164s 15647us 13120ms
3302ms
Version 1.96 ------Sequential Create------ --------Random
Create--------
voyager -Create-- --Read--- -Delete-- -Create-- --Read---
-Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
/sec %CP
16 141 0 +++++ +++ 146 0 157 0 +++++ +++
172 0
Latency 1020ms 128us 9148ms 598ms 37us
485ms
1.96,1.96,voyager,1,1314988752,95G,,1400,93,27804,3,16324,2,2925,96,41636,3,374.9,4,16,,,,,141,0,+++++,+++,146,0,157,0,+++++,+++,172,0,7576us,233s,164s,15647us,13120ms,3302ms,1020ms,128us,9148ms,598ms,37us,485ms

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stefan Keller 2011-09-03 08:56:14 Re: Summaries on SSD usage?
Previous Message Jesper Krogh 2011-09-03 06:49:27 Re: Summaries on SSD usage?