Re: High CPU load on Postgres Server during Peak times!!!!

From: Praveen DS <praveen(dot)ds(at)gmail(dot)com>
To: Shiva Raman <raman(dot)shivag(at)gmail(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: High CPU load on Postgres Server during Peak times!!!!
Date: 2009-09-24 13:36:01
Message-ID: 8e8d29970909240636s71a03769i27b0921d976183e3@mail.gmail.com
Lists: pgsql-performance

For 'idle in transaction' issues, you have to fix your code. I faced this
issue a couple of months back. How good is your exception handling? Are you
rolling back or committing your transactions when exceptions are thrown
during the course of DB operations?

Honestly I wouldn't go for these scripts which kill processes.
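
As a rough illustration of that pattern, here is a minimal sketch, assuming a
Java/JDBC application (the table name, credentials and updateStudentName helper
are hypothetical, not taken from the original application): do the work inside
an explicit transaction, commit on success, and roll back in the exception path
so the backend never sits 'idle in transaction'.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class TransactionSafety {

    // Hypothetical connection details, shown only for illustration.
    private static final String URL =
            "jdbc:postgresql://localhost:5432/dbEnterpriser_09_10";

    public static void updateStudentName(int id, String name) throws SQLException {
        Connection conn = DriverManager.getConnection(URL, "postgres", "secret");
        try {
            conn.setAutoCommit(false);             // explicit transaction control
            PreparedStatement ps = conn.prepareStatement(
                    "UPDATE students SET name = ? WHERE id = ?");
            try {
                ps.setString(1, name);
                ps.setInt(2, id);
                ps.executeUpdate();
                conn.commit();                     // end the transaction on success
            } catch (SQLException e) {
                conn.rollback();                   // end it on failure as well, otherwise
                throw e;                           // the backend stays "idle in transaction"
            } finally {
                ps.close();
            }
        } finally {
            conn.close();                          // always hand the connection back
        }
    }
}

The same applies to read-only paths when autocommit is off: a plain SELECT also
opens a transaction, so either commit (or roll back) after the read, or keep
autocommit enabled for those code paths.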

On Thu, Sep 24, 2009 at 6:20 PM, Shiva Raman <raman(dot)shivag(at)gmail(dot)com> wrote:

> Hi
>
> Today we observed a very high load. I am pasting the top output.
>
> *TOP*
> top - 12:45:23 up 79 days, 14:42, 1 user, load average: 45.84, 33.13,
> 25.84
> Tasks: 394 total, 48 running, 346 sleeping, 0 stopped, 0 zombie
> Cpu(s): 49.2%us, 0.8%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.1%si,
> 50.0%st
> Mem: 16133676k total, 14870736k used, 1262940k free, 475484k buffers
> Swap: 14466492k total, 124k used, 14466368k free, 11423616k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
> COMMAND
> 4152 postgres 17 0 2436m 176m 171m R 16 1.1 0:03.09 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4122 postgres 17 0 2431m 20m 17m R 12 0.1 0:06.38 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4007 postgres 16 0 2434m 80m 75m R 11 0.5 0:26.46 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 3994 postgres 16 0 2432m 134m 132m R 10 0.9 0:43.40 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4166 postgres 16 0 2433m 12m 8896 R 9 0.1 0:02.71 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4110 postgres 15 0 2436m 224m 217m S 8 1.4 0:06.83 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4061 postgres 16 0 2446m 491m 473m R 8 3.1 0:17.32 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4113 postgres 16 0 2432m 68m 65m R 8 0.4 0:11.03 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4071 postgres 16 0 2435m 200m 194m R 7 1.3 0:13.69 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4169 postgres 15 0 2436m 122m 117m R 7 0.8 0:00.93 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4178 postgres 16 0 2432m 77m 75m R 7 0.5 0:00.56 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4108 postgres 16 0 2437m 301m 293m R 6 1.9 0:11.94 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4155 postgres 16 0 2438m 252m 244m S 5 1.6 0:02.80 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4190 postgres 15 0 2432m 10m 8432 R 5 0.1 0:00.71 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 3906 postgres 16 0 2433m 124m 119m R 5 0.8 0:57.28 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 3970 postgres 16 0 2442m 314m 304m R 5 2.0 0:16.43 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4130 postgres 17 0 2433m 76m 72m R 5 0.5 0:03.76 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4179 postgres 16 0 2432m 105m 102m R 5 0.7 0:01.11 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4125 postgres 17 0 2436m 398m 391m R 4 2.5 0:05.62 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4162 postgres 16 0 2432m 125m 122m R 4 0.8 0:01.01 postgres:
> postgres dbEnterpriser_09_10 192.168.10.
> 4185 postgres 1
>
> *OUTPUT OF IOSTAT 1 5* (is the SAN becoming a bottleneck? It shows 50% CPU usage)
>
> clusternode2:~ # iostat 1 5
> Linux 2.6.16.46-0.12-ppc64 (clusternode2) 09/24/2009 _ppc64_ (4
> CPU)
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 16.00 0.00 0.68 0.61 10.72 71.99
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sda 1.08 1.96 22.54 13505448 155494808
> sdb 0.00 0.20 0.45 1410179 3099920
> sdc 0.00 0.05 0.01 357404 78840
> scd0 0.00 0.00 0.00 136 0
> sdd 12.20 77.69 343.49 535925176 2369551848
> sde 0.00 0.00 0.00 1120 0
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 29.46 0.00 0.25 0.00 7.43 62.87
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sda 0.00 0.00 0.00 0 0
> sdb 0.00 0.00 0.00 0 0
> sdc 0.00 0.00 0.00 0 0
> scd0 0.00 0.00 0.00 0 0
> sdd 0.00 0.00 0.00 0 0
> sde 0.00 0.00 0.00 0 0
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 46.17 0.00 0.99 0.00 38.52 14.32
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sda 0.00 0.00 0.00 0 0
> sdb 0.00 0.00 0.00 0 0
> sdc 0.00 0.00 0.00 0 0
> scd0 0.00 0.00 0.00 0 0
> sdd 3.96 0.00 118.81 0 120
> sde 0.00 0.00 0.00 0 0
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 48.88 0.00 0.99 0.00 49.88 0.25
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sda 0.00 0.00 0.00 0 0
> sdb 0.00 0.00 0.00 0 0
> sdc 0.00 0.00 0.00 0 0
> scd0 0.00 0.00 0.00 0 0
> sdd 0.00 0.00 0.00 0 0
> sde 0.00 0.00 0.00 0 0
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 47.86 0.00 2.14 0.00 50.00 0.00
>
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> sda 0.00 0.00 0.00 0 0
> sdb 0.00 0.00 0.00 0 0
> sdc 0.00 0.00 0.00 0 0
> scd0 0.00 0.00 0.00 0 0
> sdd 0.00 0.00 0.00 0 0
> sde 0.00 0.00 0.00 0 0
>
>
>
>
>
> Andy Colson wrote:
> *Eww. I think that's bad. A connection that has a transaction open will
> cause lots of row versions, which use up ram, and make it slower to step
> through the table (even with an index). You really need to fix up your code
> and make sure you commit transactions. (any statement (select, insert,
> update) will start a new transaction that you need to explicitly commit).*
>
> With reference to this suggestion by Andy Colson, we checked the
> application code and found that only INSERT and UPDATE statements are
> committed; SELECT statements are not. We use a lot of "Ajax Suggest"
> features in all the forms, which fetch data using SELECT statements that
> are never explicitly committed. We have started updating the code to fix this.
>
> Thanks for this suggestion.
>
>
> Again, thanks to Scott Marlowe for the suggestion to reduce the number of
> connections. This has now been reduced to 500.
>
>
> As I mentioned in my mail, I am restarting the database every 30 minutes.
> I found a shell script in the wiki that kills off old 'idle in transaction'
> backends on the server.
>
> This is the script:
>
> /usr/bin/test `/usr/bin/pgrep -f 'idle in transaction' | /usr/bin/wc -l` -gt 20 && \
>     /usr/bin/pkill -o -f 'idle in transaction'
>
> and this is the link where the script was provided:
>
> http://wiki.dspace.org/index.php/Idle_In_Transaction_Problem
>
> I tried to run it as a test on the server, but the script is not executing.
> I also see that a number of the 'idle in transaction' PIDs show R (running)
> status, but most of them show S (sleeping) status. Please suggest any way
> I can resolve this idle-in-transaction issue.
>
> Regards
>
> Shiva Raman
>
>
>
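
For what it's worth, before killing anything you can see which backends are
sitting idle in transaction by querying pg_stat_activity. A minimal sketch,
assuming PostgreSQL 8.x, where pg_stat_activity exposes procpid and
current_query and such backends report current_query = '<IDLE> in transaction';
the connection details are hypothetical.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class IdleInTransactionReport {
    public static void main(String[] args) throws SQLException {
        // Hypothetical connection details; adjust for your environment.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/dbEnterpriser_09_10",
                "postgres", "secret");
        try {
            Statement st = conn.createStatement();
            // List backends that are holding a transaction open but doing nothing.
            ResultSet rs = st.executeQuery(
                    "SELECT procpid, usename, query_start "
                  + "FROM pg_stat_activity "
                  + "WHERE current_query = '<IDLE> in transaction' "
                  + "ORDER BY query_start");
            while (rs.next()) {
                System.out.printf("pid=%d user=%s last query started %s%n",
                        rs.getInt("procpid"),
                        rs.getString("usename"),
                        rs.getTimestamp("query_start"));
            }
        } finally {
            conn.close();
        }
    }
}

This only reports the sessions; the real fix is still committing or rolling
back in the application, as described above.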
