Re: rogue process maxing cpu and unresponsive to signals

From: Decibel! <decibel(at)decibel(dot)org>
To: Jon Jensen <jon(at)jenseng(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: rogue process maxing cpu and unresponsive to signals
Date: 2007-08-16 05:03:21
Message-ID: BA5A3A84-0839-4159-9990-AA40538E750B@decibel.org
Lists: pgsql-novice

On Aug 15, 2007, at 9:27 PM, Jon Jensen wrote:
> I've got a simple select query that runs every 10 minutes in order
> to update data in some external rrds (it lets us make pretty graphs
> and so forth). This has been working fine for months on end, when
> suddenly yesterday the badness happened. For some reason, this same
> query that normally takes a couple seconds has now been stuck
> running for over 24 hours, maxing the CPU and generally slowing
> other queries down.
>
> The external script that initiates the query has been restarted,
> and netstat no longer shows that connection. All subsequent calls
> of the same query are quick as usual, but the renegade process
> lingers on, unresponsive to signals. Some of the things I've tried
> so far (unsuccessfully):
>
> 1. I've tried killing the process using kill from the command-line
> (INT, TERM and HUP), as well as using pg_cancel_backend() via psql.
> 2. I've tried attaching gdb to the renegade process to see what
> it's doing, but that hangs, forcing me to kill gdb (no problems
> attaching to other postgres processes however).
>
> Any other ideas? I'd like to avoid doing a kill -9 if at all
> possible. The machine is debian (sarge) running postgres 8.1.

There are a lot of parts of the code that don't check for signals,
because normally they don't run for any real length of time... until
they do. :) The factorial calculation is an example that was recently
fixed. So it's possible that something in your query is in that same
condition. You may be stuck with a kill -9, but it would be good to
identify what part of the code is hung up so we can determine if it
makes sense to add signal handling.
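
To illustrate what "checking for signals" means in a long-running loop, here is a minimal sketch in Python. It is not PostgreSQL's code (the backend is written in C and has its own interrupt-check mechanism); the names QueryCancelled, request_cancel, check_for_interrupts and check_every are all hypothetical, chosen only for this example. The idea is that the signal handler merely records the request, and the loop has to poll for it. A loop that never polls keeps running no matter how many cancel requests arrive.

```python
import signal


class QueryCancelled(Exception):
    """Raised when a long-running loop notices a pending cancel request."""


# Set by the handler, polled by long-running loops (hypothetical name).
cancel_pending = False


def request_cancel(signum=None, frame=None):
    """Signal handler: only record the request, do no real work here."""
    global cancel_pending
    cancel_pending = True


def check_for_interrupts():
    """Abort the current operation if a cancel request is pending."""
    global cancel_pending
    if cancel_pending:
        cancel_pending = False  # consume the request
        raise QueryCancelled("cancelled on user request")


def factorial(n, check_every=1000):
    """Compute n! iteratively, polling for cancellation every check_every steps."""
    result = 1
    for i in range(2, n + 1):
        result *= i
        if i % check_every == 0:
            check_for_interrupts()
    return result


if __name__ == "__main__":
    # Route SIGINT to the flag-setting handler instead of the default action.
    signal.signal(signal.SIGINT, request_cancel)
    print(factorial(10))
```

If the check is left out of the loop, or check_every is larger than the number of iterations, the request just sits in the flag until the loop finishes on its own. That is the situation described above: the signal is delivered, but nothing ever looks at it.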
--
Decibel!, aka Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
