Re: Huge spikes in number of connections doing "PARSE"

From: Noah Misch <noah(at)leadboat(dot)com>
To: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Huge spikes in number of connections doing "PARSE"
Date: 2011-03-11 16:13:43
Message-ID: 20110311161343.GB29175@tornado.gateway.2wire.net
Lists: pgsql-general

On Fri, Mar 11, 2011 at 04:13:52PM +0100, hubert depesz lubaczewski wrote:
> On Fri, Mar 11, 2011 at 03:03:55AM -0500, Noah Misch wrote:
> > On Wed, Mar 09, 2011 at 09:38:07PM +0100, hubert depesz lubaczewski wrote:
> > > So, every now and then (a couple of times per day at most), I see hundreds
> > > (800-900) of connections in "PARSE" state.
> > >
> > > I did notice one thing.
> > >
> > > We log the output of ps axo user,pid,ppid,pgrp,%cpu,%mem,rss,lstart,nice,nlwp,sgi_p,cputime,tty,wchan:25,args
> > > every 15 seconds or so.
> > >
> > > And based on its output, I was able to get stats of "wchan" of all PARSE
> > > pg processes when the problem was logged.
> > > Results:
> > >
> > > 805 x semtimedop
> > > 10 x stext
> > >
> > > Any ideas on what could be wrong? Machine was definitely not loaded most
> > > of the times it happened.
> > >
> > > The problem usually goes away in ~ 10-15 seconds.
> >
> > Would you have your monitoring process detect this condition and capture stack
> > traces, preferably from a gdb with access to debug information, of several of
> > these processes? That will probably make the specific contention point clear.
>
> Unfortunately, debug was not enabled on this server, and changing
> binaries would be rather complicated as it's a production environment.

Understood. Not a critical problem, most likely.

> I'm not a C programmer. Can you tell me how to get a stack trace (assuming
> it makes any sense without debug enabled) without damaging the process
> in any way?

gdb -ex=bt /path/to/bin/postgres $pid </dev/null

I've used this on production systems to debug issues like this one, and I've
never observed damage. The exact effect of debugger attach/detach may be
OS/kernel-dependent, so it's hard to make categorical guarantees of safety.
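
To tie this back to the earlier suggestion of having the monitoring process
capture traces when the spike occurs, here is a rough sketch of how that might
be wired up around the gdb command above. Everything beyond that command is an
assumption for illustration: the function names, the threshold and sample-size
parameters, and the idea that affected backends show up in ps with a process
title ending in "PARSE". Treat it as a starting point, not a tested tool.

```shell
#!/bin/sh
# Sketch only: function names, threshold, and sample size are hypothetical.

# Read "PID ARGS..." lines on stdin (e.g. from `ps axo pid,args`) and print
# the PIDs of postgres backends whose process title ends in "PARSE".
# Assumption: the PARSE state is visible as the last word of the ps title.
parse_pids() {
    awk '$2 ~ /^postgres/ && $NF == "PARSE" { print $1 }'
}

# Read PIDs on stdin and print a backtrace for at most $2 of them.
# $1 is the path to the postgres binary. gdb's stdin is redirected from
# /dev/null so it exits after the backtrace and does not eat the PID list.
dump_backtraces() {
    head -n "$2" | while read -r pid; do
        echo "=== backtrace for $pid ==="
        gdb -ex=bt "$1" "$pid" </dev/null 2>&1
    done
}

# If at least $1 backends are in PARSE, dump backtraces for up to $2 of
# them, using the postgres binary at $3.
capture_if_spiking() {
    pids=$(ps axo pid,args | parse_pids)
    count=$(printf '%s\n' "$pids" | grep -c .)
    if [ "$count" -ge "$1" ]; then
        printf '%s\n' "$pids" | dump_backtraces "$3" "$2"
    fi
}
```

Something like `capture_if_spiking 100 5 /path/to/bin/postgres >>traces.log`
could then run from the same loop that already logs ps output. Sampling only a
handful of processes keeps the number of debugger attaches small.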

nm
