Re: high user cpu, massive SELECTs, no io waiting problem

From: Thomas Pöhler <tp(at)turtle-entertainment(dot)de>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: high user cpu, massive SELECTs, no io waiting problem
Date: 2011-02-16 17:11:45
Message-ID: F1CD4AC5B4A5024AAB60E28A82A8450A03CEB2DA@winexch1.office.turtle-entertainment.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

we are using two instances of pgbouncer v1.4 for connection pooling.
One for prepared statements (pool_mode session) and one without (pool_mode transaction).

Pgbouncer.ini:
[pgbouncer]
pool_mode = transaction/session
server_reset_query = DISCARD ALL;
server_check_query = select 1
server_check_delay = 10
max_client_conn = 10000
default_pool_size = 450
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1
client_login_timeout = 0

I will examine htop next time during a peak.

If I remember correctly vmstat showed lots of context switches during a peak above 50k.

We are running a biweekly downtime where we do a complete reindex and vaccum full. We cannot identify certain queries causing this.

The last graph in ganglia (http://dl.dropbox.com/u/183323/CPUloadprobsdb1.jpg) shows the avg_queries from pgbouncers stats. I think this is a symptom of many waiting queries which accumulate.

Our iscsi is connected with 3Gibt/s. But that's more than enough. We don't have high traffic throughput.

This is the result of the query you gave me:

version PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Debian 4.3.2-1.1) 4.3.2, 64-bit
checkpoint_segments 40
custom_variable_classes pg_stat_statements
effective_cache_size 48335MB
escape_string_warning off
fsync on
lc_collate C
lc_ctype C
listen_addresses *
log_destination stderr
log_line_prefix %t %p %d %u %r
log_lock_waits on
log_min_duration_statement 1s
log_min_messages notice
log_rotation_size 10MB
log_temp_files 50MB
logging_collector on
maintenance_work_mem 1GB
max_connections 1000
max_prepared_transactions 5
max_stack_depth 2MB
pg_stat_statements.max 10000
pg_stat_statements.track all
port 5433
server_encoding UTF8
shared_buffers 16GB
TimeZone Europe/Berlin
update_process_title on
wal_buffers 1MB
work_mem 32MB

Seems like connection limit 10000 is way too much on pgbouncer? Our queries overall are not that CPU intensive. If they are slow, they are mostly waiting for disk io. When having a look at the traffic of this database server we see 2/3 of the traffic is going to san/disk and only 1/3 going to the server. In other words from the traffic view, 2/3 of our operations are writes and 1/3 are reads. The database is fitting completely into ram, so reads should not be a problem.

Appreciate your help!
Thomas

-----Ursprüngliche Nachricht-----
Von: Kevin Grittner [mailto:Kevin(dot)Grittner(at)wicourts(dot)gov]
Gesendet: Mittwoch, 16. Februar 2011 17:09
An: Greg Smith; Justin Pitts
Cc: pgsql-performance(at)postgresql(dot)org; Verteiler_A-Team; Björn Metzdorf; Felix Feinhals; Thomas Pöhler
Betreff: Re: [PERFORM] high user cpu, massive SELECTs, no io waiting problem

Justin Pitts <justinpitts(at)gmail(dot)com> wrote:
> I think adding
>
> UNION ALL SELECT 'postgres version', version();
>
> might be a good thing.

Good point. Added.

> Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>> Kevin Grittner wrote:
>>>
>>> In fact, I wonder whether we shouldn't leave a couple items
>>> you've excluded, since they are sometimes germane to problems
>>> posted, like lc_collate and TimeZone.
>>
>> I pulled some of them out only because they're not really
>> postgresql.conf settings; lc_collate and lc_ctype for example are
>> set at initdb time. Feel free to hack on that example if you
>> feel it could be improved, just be aware which of those things
>> are not really in the main config file when pondering if they
>> should be included.

Basically, the ones I could remember us needing to ask about on
multiple occasions, I put back -- provisionally. If someone thinks
they're pointless, I won't worry about them being dropped again:
time zone, character encoding scheme, character set, and collation.
I'm pretty sure I've seen us ask about all of those in trying to
sort out a problem.

I also tried the query on a newly installed HEAD build which had no
manual changes to the postgresql.conf file and found a few others
which seemed to me to be worth suppressing.

I took my shot -- anyone else is welcome to do so.... :-)

-Kevin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Cédric Villemain 2011-02-16 17:58:23 Re: high user cpu, massive SELECTs, no io waiting problem
Previous Message Ross J. Reedstrom 2011-02-16 16:28:32 Re: Really really slow select count(*)