Re: [SPAM] SSL renegotiation

From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [SPAM] SSL renegotiation
Date: 2013-07-10 23:58:07
Message-ID: 51DDF50F.2030402@chittenden.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Wow, that was a long time ago... I remember a few things about this

1) I was running in to an issue where every 64KB of transfer (or
something inanely low like that), SSL was being renegotiated. This was
causing performance problems over the wire. I think we settled on once
an hour renegotiating the key, but I'd have to look again to check. It
looks like the default is now 512MB, which is a more sane limit, but
still pretty easy to exhaust - I have several hosts that would run past
that default every 45sec or so, probably faster at peak times.

2) The system that I was running this on was Solaris 2.5, I believe,
and /dev/random was having a problem staying populated given the
frequent renegotiations, which prompted me to look in to this. In your
testing and attempts to repro, try draining your prng pool, or patch
things on Linux to read from /dev/random instead of /dev/urandom...
something like that may be at fault and why limited testing won't
expose this, but under load you might see it. *shrug* A WAG, but
possibly relevant.

Rough notes inline below (almost all of this should be wrapped in an
<iirc> block):

> if (ssl_renegotiation_limit && port->count > ssl_renegotiation_limit * 1024L)

This doesn't seem right. 512MB * 1024? Maybe that's why I haven't
actually had to play with this limit in a long time. Every 512GB is
much more reasonable in that it would take 12hrs to renegotiate on a
busy host. The "* 1024L" seems suspicious to me and should probably be
removed in favor of the constant passed in from the config.

> {
> SSL_set_session_id_context(port->ssl, (void *) &SSL_context,
> sizeof(SSL_context));
> if (SSL_renegotiate(port->ssl) <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure")));

This sets a bit asking the peer to renegotiation.

> if (SSL_do_handshake(port->ssl) <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure")));
> if (port->ssl->state != SSL_ST_OK)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL failed to send renegotiation request")));

Push the request to renegotiate out to the peer and check the status.

> port->ssl->state |= SSL_ST_ACCEPT;
> SSL_do_handshake(port->ssl);

In OpenSSL 0.9.6 this was the correct way to renegotiate a connection
and I would need to confirm if this is still required for OpenSSL >=
0.9.7.

> if (port->ssl->state != SSL_ST_OK)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure")));
> port->count = 0;
> }
>
> I call your attention to the fact that we're calling SSL_do_handshake
> twice here; and what's more, we're messing with the internal
> port->ssl->state variable by setting the SSL_ST_ACCEPT bit. According
> to the commit message that introduces this[1], this is "text book
> correct", but I'm unable to find the text book that specifies that this
> is correct. Is it? I have gone through the OpenSSL documentation and
> can find not a single bit on this; and I have also gone through the
> OpenSSL mailing list archives and as far as I can tell they say you have
> to call SSL_renegotiate() and then SSL_do_handshake() once, and that's
> it. (You can call SSL_renegotiate_pending() afterwards to verify that
> the renegotiation completed successfully, which is something we don't
> do.)

It would not surprise me if we could #ifdef out the "|= SSL_ST_ACCEPT"
and second call to SSL_do_handshake() if we're running OpenSSL 0.9.7 or
newer. iirc, 0.9.7's big feature was improving renegotiation.

> I have found in our archives several reports of people that get "SSL
> renegotiation failed" error messages in the log, and no explanation has
> ever been found. I instrumented that block, and I have observed that
> after the second handshake call, ssl->state is 0x2111, 0x21c0, 0x21a0
> and other values; never SSL_ST_OK. (0x2000 is SSL_ST_ACCEPT which is
> the bit we added; SSL_ST_OK is 0x03). If I remove the second
> SSL_do_handshake() call and the changes to port->ssl->state, everything
> appears to work perfectly well and there's no extra log spam (and
> ssl->state is SSL_ST_OK by the time things are finished). So apparently
> renegotiation is not actually failing, but we're just adding a confusing
> error message for no apparent reason.
>
> Thoughts?

Since 0.9.6 almost certainly has vulnerabilities that haven't been
fixed, can we just depreciate anything older than OpenSSL 0.9.7 or
#ifdef the above out? 0.9.7 is also ancient, but we're talking about
new releases... and while we're at it, can we prefer PFS ciphers?

> I have still to find where the actual problems are happening in
> unreliable networks ...

I'm guessing you're blocking on /dev/random on some systems and that's
the source of unreliability/timeouts.

> [1] commit 17386ac45345fe38a10caec9d6e63afa3ce31bb9
> Author: Bruce Momjian <bruce(at)momjian(dot)us>
> Date: Wed Jun 11 15:05:50 2003 +0000
>
> patch by Sean Chittenden (in CC), posted as [2]
>
> [2] http://www.postgresql.org/message-id/20030419190821.GQ79923@perrin.int.nxad.com

If you want, I can dig in to this further.... my brain's way back
machine isn't as on-demand as it used to be, but I can certainly help
clean some of this up. -sc

--
Sean Chittenden

In response to

Browse pgsql-committers by date

  From Date Subject
Next Message Troels Nielsen 2013-07-11 01:38:02 Re: SSL renegotiation
Previous Message Alvaro Herrera 2013-07-10 22:34:44 Re: SSL renegotiation

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2013-07-11 00:16:04 Re: pgbench patches
Previous Message Bernd Helmle 2013-07-10 23:03:09 Re: psql tab completion for updatable foreign tables