Re: rogue process maxing cpu and unresponsive to signals

From: Decibel! <decibel(at)decibel(dot)org>
To: Jon Jensen <jon(at)jenseng(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: rogue process maxing cpu and unresponsive to signals
Date: 2007-08-16 05:03:21
Message-ID: BA5A3A84-0839-4159-9990-AA40538E750B@decibel.org
Lists: pgsql-novice
On Aug 15, 2007, at 9:27 PM, Jon Jensen wrote:
> I've got a simple select query that runs every 10 minutes in order  
> to update data in some external rrds (it lets us make pretty graphs  
> and so forth). This has been working fine for months on end, when  
> suddenly yesterday the badness happened. For some reason, this same  
> query that normally takes a couple seconds has now been stuck  
> running for over 24 hours, maxing the CPU and generally slowing  
> other queries down.
>
> The external script that initiates the query has been restarted,  
> and netstat no longer shows that connection. All subsequent calls  
> of the same query are quick as usual, but the renegade process  
> lingers on, unresponsive to signals. Some of the things I've tried  
> so far (unsuccessfully):
>
> 1. I've tried killing the process using kill from the command-line  
> (INT, TERM and HUP), as well as using pg_cancel_backend() via psql.
> 2. I've tried attaching gdb to the renegade process to see what  
> it's doing, but that hangs, forcing me to kill gdb (no problems  
> attaching to other postgres processes however).
>
> Any other ideas? I'd like to avoid doing a kill -9 if at all  
> possible. The machine is Debian (sarge) running PostgreSQL 8.1.

There are many parts of the code that don't check for signals,  
because normally they don't run for any real length of time... until  
they do. :) The factorial calculation is an example that was recently  
fixed. So it's possible that something in your query is hitting one of  
those code paths. You may be stuck with a kill -9, but it would be good  
to identify which part of the code is hung so we can determine whether  
it makes sense to add a signal check there.
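
To illustrate the general idea (this is not PostgreSQL's actual code, which is written in C), here is a minimal Python sketch of a long-running loop that only notices a signal because it polls a flag at regular intervals. The names cancel_requested, request_cancel, check_for_cancel, install_handler and the check_every parameter are all made up for the example. A loop without the check_for_cancel() call would keep running until it finished, no matter how many signals arrived.

```python
import signal

# Set by the signal handler; polled by long-running loops.
cancel_requested = False


class QueryCancelled(Exception):
    """Raised when a pending cancel request is noticed at a safe point."""


def request_cancel(signum, frame):
    # Keep the handler trivial: only record that a signal arrived.
    global cancel_requested
    cancel_requested = True


def check_for_cancel():
    # Hypothetical helper: act on a recorded signal, then clear the flag.
    global cancel_requested
    if cancel_requested:
        cancel_requested = False
        raise QueryCancelled("cancelled by signal")


def factorial(n, check_every=1000):
    # Without the periodic check below, this loop would ignore
    # any cancel request until the whole computation was done.
    result = 1
    for i in range(2, n + 1):
        result *= i
        if i % check_every == 0:
            check_for_cancel()
    return result


def install_handler():
    # Route SIGINT to the flag-setting handler (call from the main thread).
    signal.signal(signal.SIGINT, request_cancel)
```

After calling install_handler(), sending SIGINT during a large factorial() call should raise QueryCancelled at the next check point instead of being ignored. The trade-off is the check interval: too rare and the loop stays unresponsive for a long time, too frequent and the check adds overhead to every iteration.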
-- 
Decibel!, aka Jim Nasby                        decibel(at)decibel(dot)org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)


