Re: Killing "stuck" queries and preventing queries from getting "stuck"

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Tim Uckun <timuckun(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Killing "stuck" queries and preventing queries from getting "stuck"
Date: 2010-09-28 06:21:12
Message-ID: 4CA18958.7020106@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 28/09/10 11:25, Tim Uckun wrote:
> On Tue, Sep 28, 2010 at 3:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Tim Uckun <timuckun(at)gmail(dot)com> writes:
>>> Is there a way to tell postgres to stop any query that runs longer
>>> than a specified amount of time? Say an hour?
>>
>> Setting statement_timeout would do that. You ought to figure out
>> what's causing the performance problem, though, instead of just
>> zapping things ...
>
> Well the query is pretty heavy but it gets run a lot. There is a
> distinct in there which seems to be the cause of most of the headaches
> but it's going to take a while to redo the application to not use
> distinct.
>
> The query gets run a lot and 99.99% of the time it runs succesfully
> and the daemon goes on it's merry way. Occasionally it seems to "get
> stuck" and killing the daemon does not unstick it.

Useful things to try when you have a "stuck" backend:

- attach strace to it and see if it's doing anything
that involves system calls

- attach gdb to it and get a backtrace to see what
it's up to. If it's using CPU, do this multiple times
to see if it's in some kind of infinite loop, as you'll
get a snapshot of different stacks if so. See:

http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

- (on linux; you didn't mention your OS):
cat /proc/$pid/stack , where $pid is the process id
of the stuck backend, to see what the backend process is
up to in the kernel.

... then post the output of all those tests here, along with the
contents of "select * from pg_stat_activity", "select * from pg_locks"
and anything from the postgresql log files that looks possibly relevant.

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

In response to

Browse pgsql-general by date

  From Date Subject
Next Message AI Rumman 2010-09-28 06:37:46 Implicit CAST is not working in Postgresql 8.4
Previous Message novnovice 2010-09-28 06:18:18 Re: Merge replication with Postgresql on Windows?