Re: Idle processes chewing up CPU?

From: "Brendan Hill" <brendanh(at)jims(dot)net>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'Craig Ringer'" <craig(at)postnewspapers(dot)com(dot)au>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Idle processes chewing up CPU?
Date: 2009-12-30 02:47:32
Message-ID: 005401ca88fa$70ac6bf0$520543d0$@net
Lists: pgsql-general

Hi Tom,

I think I've confirmed the fix. Using a dirty disconnect generator, I was
able to reliably recreate the problem within about 30-60 seconds. The
symptoms were the same as before, except that the problem occurred around
SSL_write instead of SSL_read. I assume this was due to the artificial nature
of the dirty disconnect (it is easier for the client to artificially break the
connection while waiting/receiving than while sending).

The solution you proposed solved it for SSL_write (ran for 30 minutes, no
runaway processes), and I think it's safe to assume SSL_read too. So I
suggest two additions:

====================================================
  rloop:
+     errno = 0;

      n = SSL_read(port->ssl, ptr, len);
      err = SSL_get_error(port->ssl, n);
      switch (err)
      {
          case SSL_ERROR_NONE:
              port->count += n;
              break;
====================================================

And:

====================================================
  wloop:
+     errno = 0;

      n = SSL_write(port->ssl, ptr, len);
      err = SSL_get_error(port->ssl, n);
      switch (err)
      {
          case SSL_ERROR_NONE:
              port->count += n;
              break;
====================================================

I'm not comfortable running my own compiled version in production (it was
rather difficult to get it working), so I'm interested to know when the next
release is planned. We can test beta copies on a non-critical load balancing
server if necessary.

Cheers,
-Brendan

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Sunday, 27 September 2009 2:42 PM
To: Brendan Hill
Cc: 'Craig Ringer'; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Idle processes chewing up CPU?

"Brendan Hill" <brendanh(at)jims(dot)net> writes:
> Makes sense to me. Seems to be happening rarely now.

> I'm not all that familiar with the open source process, is this likely to be
> included in the next release version?

Can you confirm that that change actually fixes the problem you're
seeing? I'm happy to apply it if it does, but I'd like to know that
the problem is dealt with.

regards, tom lane

> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Monday, 21 September 2009 5:25 AM
> To: Brendan Hill
> Cc: 'Craig Ringer'; pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] Idle processes chewing up CPU?

> "Brendan Hill" <brendanh(at)jims(dot)net> writes:
>> My best interpretation is that an SSL client dirty disconnected while
>> running a request. This caused an infinite loop in pq_recvbuf(), calling
>> secure_read(), triggering my_sock_read() over and over. Calling
>> SSL_get_error() in secure_read() returns 10045 (either connection reset, or
>> WSAEOPNOTSUPP, I'm not sure) - after this, pq_recvbuf() appears to think
>> errno=EINTR has occurred, so it immediately tries again.

> I wonder if this would be a good idea:

> #ifdef USE_SSL
>     if (port->ssl)
>     {
>         int err;

>   rloop:
> +     errno = 0;
>       n = SSL_read(port->ssl, ptr, len);
>       err = SSL_get_error(port->ssl, n);
>       switch (err)
>       {
>           case SSL_ERROR_NONE:
>               port->count += n;
>               break;

> It looks to me like the basic issue is that pq_recvbuf is expecting
> a relevant value of errno when secure_read returns -1, and there's
> some path in the Windows case where errno doesn't get set, and if
> it just happens to have been EINTR then we've got a loop.

> regards, tom lane

