Re: Sudden slow down and spike in system CPU causes max_connections to get exhausted

From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Anand Kumar, Karthik" <Karthik(dot)AnandKumar(at)classmates(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Sudden slow down and spike in system CPU causes max_connections to get exhausted
Date: 2014-01-07 22:22:54
Message-ID: CAL_0b1tJOZCx3Lo3Eve1RqGaT+JJ_Q7w4pkJ87WfWwXbTugnxw@mail.gmail.com
Lists: pgsql-general

On Mon, Jan 6, 2014 at 6:24 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Anand Kumar, Karthik" <Karthik(dot)AnandKumar(at)classmates(dot)com> writes:
>> We run postgres 9.1.11, on Centos 6.3, and an ext2 filesystem
>> Everything will run along okay, and every few hours, for about a couple of minutes, postgres will slow way down. A "select 1" query takes between 10 and 15 seconds to run, and the box in general gets lethargic.
>> This causes a pile up of connections at the DB, and we run out of max_connections.
>> This is accompanied with a steep spike in system CPU and load avg. No spike in user CPU or in I/O.
>
> System CPU only huh? There have been some reports of such behavior
> apparently caused by inefficiencies in the kernel's support of
> "transparent huge pages". See for instance this thread
>
> http://www.postgresql.org/message-id/flat/CABMVzL2y8mRM5C9xxejAyDqe0i1S78RAE3cEATGYNf5Ktz_Zdg(at)mail(dot)gmail(dot)com
>
> although it looks like in that case the real fix was to reduce the number
> of backends.

I experienced the THP defragmentation problem even with <10
connections. What always saves me is to set

echo always > /sys/kernel/mm/transparent_hugepage/enabled
echo madvise > /sys/kernel/mm/transparent_hugepage/defrag

The names might be slightly different on CentOS, something like
redhat_transparent_hugepage; I don't remember exactly.
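
Since the exact directory name is uncertain, a small check before writing anything may help. The sketch below is not from the original message: the helper names thp_dir and show_thp are hypothetical, and the two candidate directory names are assumptions (the mainline name used above plus the redhat_-prefixed variant mentioned for CentOS). It only reads the current settings and changes nothing.

```shell
# Hypothetical helper: print the first THP sysfs directory that exists.
# The optional argument overrides the sysfs root (handy for a dry run);
# both candidate names below are assumptions, adjust them for your kernel.
thp_dir() {
    root="${1:-/sys/kernel/mm}"
    for name in transparent_hugepage redhat_transparent_hugepage; do
        if [ -d "$root/$name" ]; then
            printf '%s\n' "$root/$name"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: show the current "enabled" and "defrag" settings.
# The kernel marks the active value with [brackets].
show_thp() {
    dir="$(thp_dir "$@")" || {
        echo "THP sysfs directory not found" >&2
        return 1
    }
    for f in enabled defrag; do
        printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
    done
}
```

Calling show_thp with no arguments reads from /sys/kernel/mm. Once the right directory is known, the two echo commands above can be pointed at it (as root).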

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com
