How to analyze load average ?

From: Condor <condor(at)stz-bg(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: How to analyze load average ?
Date: 2012-08-06 14:23:14
Message-ID: 99b972be0de3aff2f222f253ff79ce37@stz-bg.com
Lists: pgsql-general

Hello,

Can someone tell me how I can find out what is driving up the load
average on my server?

I have one server with 128 GB of memory, 32 x86_64 CPUs, and RAID5
across three 15k SAS HDDs with an ext4 filesystem. It is my production
server and is also configured to send WAL files over the network. Here
is my configuration:

max_connections = 500
shared_buffers = 32GB
work_mem = 192MB
maintenance_work_mem = 6GB
max_stack_depth = 6MB
bgwriter_delay = 200ms
bgwriter_lru_maxpages = 100
bgwriter_lru_multiplier = 2.0
wal_level = hot_standby
fsync = on
synchronous_commit = on
wal_sync_method = fdatasync
full_page_writes = on
wal_buffers = -1
checkpoint_segments = 32
checkpoint_timeout = 5min
checkpoint_completion_target = 0.5
max_wal_senders = 5
wal_sender_delay = 1s
wal_keep_segments = 64

enable_bitmapscan = on
enable_hashagg = on
enable_hashjoin = on
enable_indexscan = on
enable_material = on
enable_mergejoin = on
enable_nestloop = on
enable_seqscan = on
enable_sort = on
enable_tidscan = on

seq_page_cost = 1.0
random_page_cost = 2.0
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.005
cpu_operator_cost = 0.0025
effective_cache_size = 64GB

autovacuum = on
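
As a rough cross-check of the memory settings above, the sketch below adds shared_buffers to one work_mem allocation per allowed connection and compares the total with the 128 GB of RAM. This is only an illustration under a stated assumption: PostgreSQL applies work_mem per sort or hash operation, so a single connection may use more or less than one allocation. The helper names (parse_size, memory_budget_gb) are hypothetical.

```python
# Hypothetical helpers for a back-of-the-envelope memory budget.
UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(value):
    """Convert a postgresql.conf style size such as '192MB' to bytes."""
    for suffix, factor in UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    raise ValueError("unsupported size: %r" % value)


def memory_budget_gb(shared_buffers, work_mem, max_connections):
    """shared_buffers plus ONE work_mem allocation per connection, in GB.

    Assumption: one allocation per connection; real queries can use
    several (one per sort/hash node) or none at all.
    """
    total = parse_size(shared_buffers) + max_connections * parse_size(work_mem)
    return total / UNITS["GB"]


# Values taken from the configuration posted above.
budget = memory_budget_gb("32GB", "192MB", 500)
print("estimated budget: %.2f GB of 128 GB" % budget)
```

With the posted values, the work_mem part alone is 500 x 192 MB, so the estimate sits close to the physical memory of the machine if every connection were active at once.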

My onboard RAID cache write-through is OFF.

When I connect to the server, I see only 2 queries with select * from
pg_stat_activity;
and they are not complicated, e.g. select rid from table where id = 1;
Both tables have indexes on the most frequently used columns. When I
check, my server load average is 0.88 0.94 0.87.
I'm trying to find out why the load average is so high; only PostgreSQL
9.1.4 is running on that server.
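
To put the reported figures in proportion to the machine, the sketch below divides each load average by the CPU count. On Linux the load average counts tasks that are runnable or in uninterruptible sleep, so a per-CPU value well below 1.0 suggests the CPUs are not saturated. The function names are hypothetical, and the input string simply mirrors the three numbers quoted above (the same layout as the first fields of /proc/loadavg).

```python
def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages as floats.

    Accepts the first three whitespace-separated fields, as found at
    the start of /proc/loadavg on Linux.
    """
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def per_cpu(loads, cpu_count):
    """Scale each load average by the number of CPUs."""
    return tuple(load / cpu_count for load in loads)


# Numbers quoted in the message; 32 is the CPU count stated earlier.
loads = parse_loadavg("0.88 0.94 0.87")
normalized = per_cpu(loads, 32)
for label, value in zip(("1 min", "5 min", "15 min"), normalized):
    print("%-6s load per CPU: %.4f" % (label, value))
```

On a live system the same function could be fed the contents of /proc/loadavg, with the CPU count taken from os.cpu_count().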

Can someone point me to where I should start digging? Based on my
reading of the documentation, I think my connection and shared buffer
settings are right.
I think this slowdown may be because the cache on the RAID card is
OFF. As I read on the PostgreSQL wiki pages,
if I turn that setting ON, I might lose some of my data in a failure,
but the company has a UPS and I also have streaming replication, so I
won't lose much data.

My iostat shows:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.90    0.00    1.06    0.00    0.00   98.04

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.92    0.00    1.06    0.00    0.00   97.02

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0

And my vmstat:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b swpd     free   buff    cache si so bi bo   in   cs us sy id wa
 0  0    0 99307408 334300 31144708  0  0  1 18    1    0  1  1 98  0
 0  0    0 99303808 334300 31144716  0  0  0  0  926  715  0  0 99  0
 0  0    0 99295232 334300 31144716  0  0  0  0  602  532  0  0 99  0
 4  0    0 99268160 334300 31144716  0  0  0 32  975  767  2  2 96  0
 1  0    0 99298544 334300 31144716  0  0  0  0  801  445  3  2 95  0
 0  0    0 99311336 334300 31144716  0  0  0  0  320  175  1  0 98  0
 2  0    0 99298920 334300 31144716  0  0  0  0 1195  996  1  1 97  0
 0  0    0 99307184 334300 31144716  0  0  0  0  843  645  0  1 98  0
 0  0    0 99301024 334300 31144716  0  0  0 12 1346 1040  2  2 96  0
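
One way to relate the vmstat samples to the load average is to average the r (runnable) and b (blocked) columns, since those are the task states the Linux load average counts. The sketch below does that for the rows posted above; it skips the first row because vmstat's first report gives averages since boot. The names VMSTAT_ROWS and parse_rows are hypothetical, and nine samples are far too few for a firm conclusion.

```python
# Rows copied from the vmstat output above, one sample per line.
VMSTAT_ROWS = """\
0 0 0 99307408 334300 31144708 0 0 1 18 1 0 1 1 98 0
0 0 0 99303808 334300 31144716 0 0 0 0 926 715 0 0 99 0
0 0 0 99295232 334300 31144716 0 0 0 0 602 532 0 0 99 0
4 0 0 99268160 334300 31144716 0 0 0 32 975 767 2 2 96 0
1 0 0 99298544 334300 31144716 0 0 0 0 801 445 3 2 95 0
0 0 0 99311336 334300 31144716 0 0 0 0 320 175 1 0 98 0
2 0 0 99298920 334300 31144716 0 0 0 0 1195 996 1 1 97 0
0 0 0 99307184 334300 31144716 0 0 0 0 843 645 0 1 98 0
0 0 0 99301024 334300 31144716 0 0 0 12 1346 1040 2 2 96 0
"""

COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def parse_rows(text):
    """Turn whitespace-separated vmstat rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        values = [int(v) for v in line.split()]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rows = parse_rows(VMSTAT_ROWS)
samples = rows[1:]  # drop the first report (averages since boot)

avg_runnable = mean(row["r"] for row in samples)
avg_blocked = mean(row["b"] for row in samples)
avg_idle = mean(row["id"] for row in samples)

print("avg runnable: %.3f" % avg_runnable)
print("avg blocked:  %.3f" % avg_blocked)
print("avg idle %%:   %.2f" % avg_idle)
```

If the averaged r value lands near the reported load average while b and wa stay at zero, that would point to brief bursts of runnable processes rather than tasks waiting on disk.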

Can anyone tell me how to find out why the load average is so high?

Thanks
