Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative
Date: 2018-09-26 21:19:15
Message-ID: CE440587-6093-44F6-A3CA-D1AFE6392B9B@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 26 Sep 2018, at 01:08, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> On 27/06/18 21:57, Daniel Gustafsson wrote:
>>> Courtesy of the ever-present Murphy I managed to forget some testcode in
>>> src/backend/Makefile which broke compilation for builds without secure
>>> transport, attached v8 patch fixes that.
>
>> I've read through this patch now in more detail. Looks pretty good, but
>> I have a laundry list of little things below. The big missing item is
>> documentation.
>
> I did some simple testing on this today. The patch has bit-rotted,
> mostly as a consequence of 77291139c which removed tls-unique channel
> binding. That's probably a good thing for the Secure Transport code,
> since it wasn't supporting that anyway, but it needs to be updated.

Thanks for taking a look!

> I ripped out the failing-to-compile code (maybe that was too simplistic?)
> but could not figure out whether there was anything still useful about
> the diffs in ssl/t/002_scram.pl, so I left those out. Anyway, the
> attached update applies cleanly to today's HEAD, and the openssl part
> still compiles cleanly and passes regression tests. The securetransport
> part compiles cleanly, but it fails 8 of the 68 tests in 001_ssltests.pl.
> I'm not sure how many of those might be new and how many were there as
> of the previous submission.

I had some local diffs that I hacked up based on Heikkis review this summer,
but I never got around to sending them out. I’ve rebased these changes on top
of your v9 patch as the attached v10. With this patch I get 5 failing tests on
High Sierra.

>> The "-framework" option, being added to CFLAGS, is clang specific. I
>> think we need some more autoconf magic, to make this work with gcc.
>
> AFAIK, Apple's "gcc" *is* clang; it certainly has no problem with
> these switches (and I rather doubt there's any hope of linking to
> Secure Transport without 'em).

That is correct, there is however q legitimate question as to how to support
those who install an actual gcc via for example homebrew. -framework is a
linker option which according to the GCC docs isn’t being passed to the linker
by GCC. AFAICT from reading, -framework is however merely a convenient way of
using -l, so it should be doable to just use normal -l for both clang and gcc,
though I haven’t researched that. Another option is to require a macOS
clang(gcc) for secure transport support, at least initially.

It’s not clear to me just how common it is to use GCC via homebrew on macOS.

> However, I agree that the technique
> of teaching random makefiles about this explicitly is mighty ugly,
> and we ought to put the logic into configure instead, if possible.
> Possibly it could be modeled on LDAP_LIBS or ICU_LIBS, ie configure
> sets up a macro that pulls in the openssl libraries, or the
> secure transport libraries, or $other-implementation, or nothing.
> The CFLAGS hacks need similar treatment (btw, should they be
> CPPFLAGS hacks instead? I think they're morally equivalent to
> -I switches). And avoid using "override" if at all possible.

Agreed, thats a better idea.

> Some other comments:
>
> * I notice that contrib/sslinfo hasn't been touched. That probably
> ought to be on the to-do list for this, though I don't insist that
> it needs to be in the first commit.

Right, that one is on the todo.

> * I'm not sure what the "keychain" additions to test/ssl/Makefile
> are for, but they don't seem to be wired up to anything. Running
> "make keychains" has no impact on the test failures, either.

Since keychain files are binary blobs the make target is intended to create the
required keychain files out of the existing certificate/keys for use with the
SSL tests. Generating keychains needs to be wired to running the tests on
secure transport enabled systems (iff we want to have tests for keychains of
course).

> * I do not like making the libpq connection parameters for this be
> #ifdef'd out when the option isn't selected. I believe project policy is
> that we accept all parameters always, and either ignore unsupported ones,
> or throw errors if they're set to nondefault values (cf. the comment above
> the sslmode parameter in fe-connect.c). I realize that some earlier
> patches like GSSAPI did not get the word on this, but that's not a reason
> to emulate their mistake. I'm not sure about the equivalent decision
> w.r.t. backend GUCs, but we need to figure that out.

Makes sense. I added these just after the #if defined(ENABLE_GSS) lines and
just took a page from that playbook. Haven’t changed in the attached but agree
it should be done.

> * In place of changes like
> -#ifdef USE_SSL
> +#if defined(USE_SSL) && defined(USE_OPENSSL)
> I'd be inclined to just do "#ifdef USE_OPENSSL", ie assume that macro
> can't be set without USE_SSL.

Good point, fixed in the attached.

Also, below are responses to the laundrylist raised by Heikki upthread:

> --- laundry list begins ---
>
> What exactly does 'ssl_is_server_start' mean? I don't understand that mechanism. ISTM it's only checked in load_key(), via be_tls_open_server(), which should only be called after be_tls_init(), in which case it's always 'true' when it's checked. Should there be an Assertion on it or something?

Correct, this was a broken attempt at handling the server reload that didn’t
work due to brainfade. Removed and tidied up.

> In be_tls_open_server(), I believe there are several cases of using variables uninitialized. For example, load_identity_keychain() doesn't always set the 'identity' output parameter, but there's an "if (identity == NULL)" check after the call to it. And 'certificates' is left uninitialized, if load_identity_keychain() is used. Also, 'dh_len' is used uninitialized here in the "if (!dh_buf || dh_len == 0)" line, if the 'ssl_dh_params_file' option is not set. Did clang not give warnings about these?’

Nope, no warnings were issueed. I do agree with your findings and have fixed
these and looked for others.

>> + /*
>> + * Certificate Revocation List are not supported in the Secure Transport
>> + * API
>> + */
>
> The corresponding client code has a longer comment on that:
>
>> + /*
>> + * There is no API to load CRL lists in Secure Transport, they can however
>> + * be imported into a Keychain with the commandline application "certtool".
>> + * For libpq to use them, the certificate/key and root certificate needs to
>> + * be using an identity in a Keychain into which the CRL have been
>> + * imported. That needs to be documented.
>> + */
>
> Is it possible to programmatically create a temporary keychain, in memory, and load the CRL to it? (I googled a bit, and couldn't find any suitable API for it, so I guess not..)

I attempted this first using the lowlevel CSSM/CDSA APIs, but deleted it again
as it turned into well over 1500 lines of invoking undocumented CSSM API calls.
And it wasn’t event a complete support at that point. No official APIs exist
for dealing with CRL files. Having the user load the CRL into the Keychain is
about the only thing we can do.

>> + if (ssl_crl_file[0])
>> + ereport(FATAL,
>> + (errmsg("CRL files not supported with Secure Transport")));
>
> I think this should be COMMERROR, like other errors around this. We don't want to pass that error message to the client. Although I guess it won't reach the client as we haven't negotiated TLS yet.

Fixed.

>> + /*
>> + * We use kTryAuthenticate here since we don't know which sslmode the
>> + * client is using. If we were to use kAlwaysAuthenticate then sslmode
>> + * require won't work as intended.
>> + */
>> + if (ssl_loaded_verify_locations)
>> + SSLSetClientSideAuthenticate((SSLContextRef) port->ssl, kTryAuthenticate);
>
> That comment sounds wrong. This is in backend code, and SSLSetClientSideAuthenticate() is all about checking the client's certificate in the server, while libpq 'sslmode' is about checking the server's certificate in the client.

Fixed.

>> + for (;;)
>> + {
>> + status = SSLHandshake((SSLContextRef) port->ssl);
>> + if (status == noErr)
>> + break;
>> +
>> + if (status == errSSLWouldBlock)
>> + continue;
>> + ...
>> + }
>
> Does this busy-wait, if there's not enough data from the client? That seems bad. In the corresponding client-side code, there's a comment on that too, but it's still bad.

It does busy-wait, and was doing so because I had misunderstood the Secure
Transport documentation. I’ve fixed it now similarly to how the OpenSSL code
handles it. Fixing it client-side remains a TODO.

> In be_tls_get_version and PQsslAttribute, can we add support for kTLSProtocol13? Perhaps with an autoconf test, if the constant is not available on all macOS versions.

Good point, I’ve added an autoconf test for it with an ifdef around using it in
the code.

> In be_tls_get_peerdn_name, wouldn't SecCertificateCopyCommonName() be more appropriate than SecCertificateCopyLongDescription()?

The OpenSSL support use X509_get_subject_name() which afaict returns the full
subject and not just the CN, so SecCertificateCopyLongDescription() should be
the equivalent.

> In be_tls_get_cipher, returning "unknown" would seem better than erroring out, if the cipher isn't recognized.

Fixed.

> Check error message style. eg.:
>
>> + ereport(COMMERROR,
>> + (errmsg("could not load server certificate \"%s\": \"%s\"",
>> + ssl_cert_file, pg_SSLerrmessage(status))));
>
> Why is the result of pg_SSLerrmessage() in quotes? Maybe errdetail?

I really don’t know why I had put it in quotes, but it was clearly not a good
idea. Fixed all occurrences to not use quotes, but kept them in errmsg() for
now.

>> + * Any consumers of the Keychain array should always call this to ensure that
>> + * it is set up in a manner that reflect the configuration. If it not, then
>
> s/reflect/reflects/

Fixed.

>> + else if (keychain_count == 2)
>> + {
>> + if (ssl_keychain_file[0] == '\0' || !ssl_keychain_use_default)
>> + goto error;
>> +
>> + return;
>> + }
>> + else
>> + /* Fallthrough to erroring out */
>> +
>> +error:
>> + ereport(FATAL,
>> + (errmsg("Incorrect loading of Keychains detected")));
>> +}
>
> That fallthrough looks dangerous.

Fixed and refactored the function to be clearer while at it. The default for
the ssl_keychain_use_default GUC is set to false, which seems a safe bet.
During hacking I’ve temporarily set it to ‘on’ in the testsuite though, the
keychain setup for the tests needs to be figured out (as in how and what do we
want to test).

>> --- a/src/backend/utils/misc/postgresql.conf.sample
>> +++ b/src/backend/utils/misc/postgresql.conf.sample
>> @@ -106,6 +106,7 @@
>> #ssl_dh_params_file = ''
>> #ssl_passphrase_command = ''
>> #ssl_passphrase_command_supports_reload = off
>> +#ssl_keychain_file = ''
>
> Do we want to have this Secure Transport-specific option in the sample config file? Comment at least

On second thought I don’t think we do, so removed this.

>> +#elif USE_SECURETRANSPORT
>> + void *ssl;
>> + int ssl_buffered;
>
> Add comments. Mention the actual type of 'ssl' (SSLContextRef), and what 'ssl_buffered' means.

Fixed.

> libpq-be.h and libpq-int.h use a different strategy to avoid clashes between PostgreSQL's and Secure Transport's versions of Size and uint64. Let's be consistent. (I like the libpq-fe.h version more, i.e. using "void *" and avoiding the #include)

Yeah, I’ve been meaning to fix that but clearly forgot. Fixed now.

>> - * SSL implementation (e.g. be-secure-openssl.c)
>> + * SSL implementation (e.g. be-secure-<implementation>.c)
>
> Since this is just an example, I think leaving it as be-secure-openssl.c is fine.

Fixed.

> Also update pg_config.h.win32 with the new USE_SECURETRANSPOT flag being added to pg_config.h.in.

Fixed.

> In the call to SSLSetPeerDomainName(), should check return code.

Fixed.

>> + /*
>> + * In sslmode "require" we accept some certificate verification
>> + * failures when we don't have a rootcert since MITM protection
>> + * isn't enforced. Check the reported failure and trust in case
>> + * the cert is missing, self signed or expired/future.
>> + */
>> + if (strcmp(conn->sslmode, "require") == 0 && !conn->st_rootcert)
>> + {
>
> Not just "require", but "allow" as well, right?
>
> freePGconn should free conn->keychain. Would be good to write a little test that opens & closes millions of connectins, to check for memory leaks.

Fixed (the free part, not the util but I agree it would be interesting).

>> + * Queries the specified Keychain, or the default unless not defined, for a
>
> "unless not defined" is a double-negative. I don't understand that sentence. (there are two copies of the same comment in the patch, in FE and BE. And the FE function is called "load_identity_keychain", but its comment says "import_identity_keychain” )

Fixed.

> PQinitOpenSSL and PQinitSSL are inconsistent in whether to call pgtls_init_library(), when compiled with Secure Transport. pgtls_init_library() is a no-op, so it doesn't matter, but let's be consistent. Perhaps do "#define pgtls_init_library() ((void)true)" in the header?

Fixed.

> s/readiblitity/readability/
> s/dont/don't/
> s/securetransport_common.h/securetransport.h/
> s/f.e/for example/ (at least I'm not familiar with that abbreviation)

Fixed.

cheers ./daniel

Attachment Content-Type Size
secure-transport-v10.patch application/octet-stream 138.5 KB
unknown_filename text/plain 2 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-09-26 21:41:36 Re: Allowing printf("%m") only where it actually works
Previous Message Tomas Vondra 2018-09-26 21:09:07 Re: [PATCH] Improve geometric types