
R: R: R: R: R: BUG #6342: libpq blocks forever in "poll" function

From: "Andrea Grassi" <andreagrassi(at)sogeasoft(dot)com>
To: "'Craig Ringer'" <ringerc(at)ringerc(dot)id(dot)au>,"'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <harrywr2(at)comcast(dot)net>,"'Pg Bugs'" <pgsql-bugs(at)postgresql(dot)org>,"'Alvaro Herrera'" <alvherre(at)commandprompt(dot)com>
Subject: R: R: R: R: R: BUG #6342: libpq blocks forever in "poll" function
Date: 2011-12-21 11:46:01
Message-ID: 000c01ccbfd6$1d5a5e70$580f1b50$@com
Lists: pgsql-bugs
I will now meet with my colleague, the systems engineer who takes care of
the machine, and pass on your hints (suggested by Craig Ringer) about how to
detect and log kernel issues.
In case it is useful, the content of the file /proc/$pid/wchan at the moment
of the hang is "_stext".

In the meantime, to be sure that this cannot be a libpq bug, I would like to
ask you one thing.
I searched the Internet for detailed specifications of the poll/select
system calls, but one point is still unclear to me: which of these two
statements is true?
1) poll/select wait only for FUTURE changes to the read-ready state of the
sockets
2) poll/select check whether there is something to read at the moment of the
call, and otherwise wait for FUTURE changes to the read-ready state
 
If the first statement were true, the server's answer could arrive between
the request and the call to poll (this interval is surely very short, but it
is still strictly greater than 0, so the answer could arrive within it).
Theoretical sequence:
1) The client sends a request to the server
2) The server answers the client
3) The client waits by calling poll
In this case the client and the server would end up in a sort of deadlock,
each waiting for the other, and that could be a libpq bug.

What do you think? Is this scenario possible, or is the second statement the
true one?
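
A small experiment can help decide between the two statements. What follows
is only a sketch, written in Python rather than C for brevity. It assumes a
Unix-like system where select.poll is available, and it uses a local socket
pair in place of a real PostgreSQL connection. The function name
data_ready_before_poll is made up for this illustration. The "server" answer
is sent BEFORE the client calls poll, which reproduces the theoretical
sequence above.

```python
import select
import socket


def data_ready_before_poll(timeout_ms=1000):
    """Return True if poll() reports data that arrived BEFORE poll() was called.

    Illustration only: a local socket pair stands in for the client/server
    connection; this is not how libpq itself is implemented.
    """
    client, server = socket.socketpair()
    try:
        # Step 1: the client sends a request to the server.
        client.sendall(b"request")
        server.recv(16)

        # Step 2: the server answers while the client is not yet polling.
        server.sendall(b"answer")

        # Step 3: only now does the client call poll().
        poller = select.poll()
        poller.register(client, select.POLLIN)
        events = poller.poll(timeout_ms)

        # If statement 1 were true, poll() would time out and return no events.
        return any(
            fd == client.fileno() and bool(ev & select.POLLIN)
            for fd, ev in events
        )
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(data_ready_before_poll())
```

POSIX describes poll as reporting the current readiness of a descriptor, so
on a correctly working kernel the function should return True, which would
support statement 2. A result of False would point towards the kernel rather
than libpq.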

Regards, Andrea



-----Original Message-----
From: Craig Ringer [mailto:ringerc(at)ringerc(dot)id(dot)au]
Sent: Wednesday, 21 December 2011 0:56
To: Tom Lane
Cc: Andrea Grassi; harrywr2(at)comcast(dot)net; 'Pg Bugs'; 'Alvaro Herrera'
Subject: Re: R: R: R: R: [BUGS] BUG #6342: libpq blocks forever in "poll"
function

On 21/12/2011 1:42 AM, Tom Lane wrote:
> Hrm.  What's with the 48 bytes in the client's receive queue?  Surely
> the kernel should be reporting that the socket is read-ready, if it's
> got some data.  I think you've found an obscure kernel bug ---- somehow
> it's failing to wake the poll() caller.
>
I've been leaning that way too; that's why I was asking him for 
/proc/$pid/stack and `ps -C programname -o wchan:80=` output - to get 
some idea of what function in the kernel it's sitting in.
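
For logging this automatically while waiting for the hang to recur, the wait
channel can also be read straight from /proc. The following is a minimal
sketch in Python, assuming a Linux /proc filesystem. The helper name
read_wchan and its proc_root parameter are hypothetical and exist only for
this illustration.

```python
from pathlib import Path


def read_wchan(pid, proc_root="/proc"):
    """Return the wait channel recorded for `pid`, or None if it is unreadable.

    Reads <proc_root>/<pid>/wchan, the same file mentioned in this thread.
    `proc_root` is a hypothetical parameter added so the helper can be
    exercised against a fake directory tree.
    """
    path = Path(proc_root) / str(pid) / "wchan"
    try:
        return path.read_text().strip()
    except OSError:
        # The process is gone, the file is missing, or access was denied.
        return None


if __name__ == "__main__":
    import os

    # Print the wait channel of the current process as a quick smoke test.
    print(read_wchan(os.getpid()))
```

Calling this periodically from a watchdog script and writing the result to a
log would record which kernel function the client was sleeping in when the
hang occurred.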

Unfortunately the OP is on some enterprise distro that doesn't have 
/proc/$pid/stack . wchan info would still be useful. I wonder how old 
their kernel is? The bug could've already been fixed. /proc/pid/stack 
has been around since 2008 so it must be pretty elderly.

OP: You can also get a kernel stack for a process by enabling the magic 
SysRQ key (see Google) then using Alt-SysRq-T . This requires a physical 
keyboard directly connected to the server. It emits the stack 
information via dmesg. See:

http://en.wikipedia.org/wiki/Magic_SysRq_key

There's a "sysrqd" that apparently lets you use these features remotely, 
but I've never tried it.

--
Craig Ringer

