Re: pg_cancel_backend() does not work with buzz queries

From: Richard Huxton <dev(at)archonet(dot)com>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org >> PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: pg_cancel_backend() does not work with buzz queries
Date: 2007-10-03 11:47:32
Message-ID: 47038154.9020307@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-ru-general

Sergey Konoplev wrote:
>> Don't forget to cc: the list.
>> Try not to top-post replies, it's easier to read if you reply below the
>> text you're replying to.
>
> Thanx for your advice. I'm just absolutely worned out. Sorry.

Know that feeling - let's see if we can't sort this out.

>>>> 1. Is it always the same query?
>>>> 2. Does the client still think it's connected?
>>>> 3. Is that query using up CPU, or just idling?
>>>> 4. Anything odd in pg_locks for the problem pid?
>>> 1. No it isn't. I have few functions (plpgsql, plpython) that cause
>>> such situations more often than another but they are called more often
>>> also.
>> OK, so there's no real pattern. That would suggest it's not a particular
>> query-plan that's got something wrong.
>>
>> Do you always get this problem inside a function?
>
> As far as I remember I do.

Hmm - check Magnus' thoughts on pl/python. Can't comment on Python
myself. Are you sure it's not always the same few function(s) that cause
this problem?

>>> 2. The client just waits for query and buzz.
>>> 3. They are using CPU in usual way and their pg_lock activity seems normal.
>> So the backend that appears "stuck" is still using CPU?
>
> Yes but the metter is that this procedures usualy use CPU just a
> little so I can't find out if there is some oddity or not.

OK, so it's not that it's stuck in a loop wasting a lot of CPU

>> So - the symptoms are:
[snip]
> Exactly.

So - we need to solve two mysteries
1. Why are these functions not returning?
2. Why does SIGINT not interrupt them?

>> Are you happy that your hardware and drivers are OK? There aren't
>> problems with any other servers on this machine?
>
> Yes I'm quite happy. My hardware is: 2 double-core Xeon, 8Gb RAM,
> RAID5. What about other software... it's dedicated PG server so I have
> no problem with it.

Well, the places I'd look would be:
1. Hardware (you're happy that's fine, and it's not quite the problems
I'd expect)
2. Drivers (same as #1)
3. Client connectivity (but you say the client is fine)
4. External interactions (see below)
5. Bug in PG extension (pl/python)
6. Bug in PG core code

Do any of your functions interact with the outside world - fetch
webpages or similar? It could be they're waiting for that. If you're
using a library that could hang waiting for a response and also block
SIGINT at the same time that would explain everything.

--
Richard Huxton
Archonet Ltd

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alban Hertroys 2007-10-03 12:09:54 Tsearch2 Dutch snowball stemmer in PG8.1
Previous Message Sergey Konoplev 2007-10-03 11:45:01 Re: pg_cancel_backend() does not work with buzz queries

Browse pgsql-ru-general by date

  From Date Subject
Next Message Заяц Алексей 2007-10-03 12:07:10 Re: Fwd: [pgsql-ru-general] pg_cancel_backend() не снимает зависшие транзакции
Previous Message Sergey Konoplev 2007-10-03 11:45:01 Re: pg_cancel_backend() does not work with buzz queries