Re: Support for NSS as a libpq TLS backend

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Jacob Champion <pchampion(at)vmware(dot)com>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "andrew(dot)dunstan(at)2ndquadrant(dot)com" <andrew(dot)dunstan(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Subject: Re: Support for NSS as a libpq TLS backend
Date: 2021-03-25 23:22:33
Message-ID: 051B5E92-066F-4079-8C40-01009CFEBC4B@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 23 Mar 2021, at 20:04, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Greetings,
>
> * Daniel Gustafsson (daniel(at)yesql(dot)se) wrote:
>>> On 22 Mar 2021, at 00:49, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>
>> Thanks for the review! Below is a partial response, I haven't had time to
>> address all your review comments yet but I wanted to submit a rebased patchset
>> directly since the current version doesn't work after recent changes in the
>> tree. I will address the remaining comments tomorrow or the day after.
>
> Great, thanks!
>
>> This rebase also includes a fix for pgtls_init which was sent offlist by Jacob.
>> The changes in pgtls_init can potentially be used to initialize the crypto
>> context for NSS to clean up this patch, Jacob is currently looking at that.
>
> Ah, cool, sounds good.
>
>>> They aren't the same and it might not be
>>> clear what's going on if one was to somehow mix them (at least if pr_fd
>>> continues to sometimes be a void*, but I wonder why that's being
>>> done..? more on that later..).
>>
>> To paraphrase from a later in this email, there are collisions between nspr and
>> postgres on things like BITS_PER_BYTE, and there were also collisions on basic
>> types until I learned about NO_NSPR_10_SUPPORT. By moving the juggling of this
>> into common/nss.h we can use proper types without introducing that pollution
>> everywhere. I will address these places.
>
> Ah, ok, and great, that sounds good.

>>>> + /*
>>>> + * Set the fallback versions for the TLS protocol version range to a
>>>> + * combination of our minimal requirement and the library maximum. Error
>>>> + * messages should be kept identical to those in be-secure-openssl.c to
>>>> + * make translations easier.
>>>> + */
>>>
>>> Should we pull these error messages out into another header so that
>>> they're in one place to make sure they're kept consistent, if we really
>>> want to put the effort in to keep them the same..? I'm not 100% sure
>>> that it's actually necessary to do so, but defining these in one place
>>> would help maintain this if we want to. Also alright with just keeping
>>> the comment, not that big of a deal.
>>
>> It might make sense to pull them into common/nss.h, but seeing the error
>> message right there when reading the code does IMO make it clearer so it's a
>> doubleedged sword. Not sure what is the best option, but I'm not married to
>> the current solution so if there is consensus to pull them out somewhere I'm
>> happy to do so.
>
> My thought was to put them into some common/ssl.h or something along
> those lines but I don't see it as a big deal either way really. You
> make a good point that having the error message there when reading the
> code is nice.

Thinking more on this, I think my vote will be to keep them duplicated in the
code for readability. Unless there are strong feelings against I think we at
least should start there.

>>> Maybe we should put a stake in the ground that says "we only support
>>> back to version X of NSS", test with that and a few more recent versions
>>> and the most recent, and then rip out anything that's needed for
>>> versions which are older than that?
>>
>> Yes, right now there is very little in the patch which caters for old versions,
>> the PR_Init call might be one of the few offenders. There has been discussion
>> upthread about settling for a required version, combining the insights learned
>> there with a survey of which versions are commonly available packaged.
>>
>> Once we settle on a version we can confirm if PR_Init is/isn't needed and
>> remove all traces of it if not.
>
> I don't really see this as all that hard to do- I'd suggest we look at
> what systems someone might reasonably deploy v14 on. To that end, I'd
> say "only systems which are presently supported", so: RHEL7+, Debian 9+,
> Ubuntu 16.04+.

Sounds reasonable.

> Looking at those, I see:
>
> Ubuntu 16.04: 3.28.4
> RHEL6: v3.28.4
> Debian: 3.26.2

I assume these have matching NSPR versions placing the Debian 9 NSPR package as
the lowest required version for that?

>>> I have a pretty hard time imagining that someone is going to want to build PG
>>> v14 w/ NSS 2.0 ...
>>
>> Let alone compiling 2.0 at all on a recent system..
>
> Indeed, and given the above, it seems entirely reasonable to make the
> requirement be NSS v3+, no? I wouldn't be against making that even
> tighter if we thought it made sense to do so.

I think anything but doing that would be incredibly unreasonable.

>>> Also- we don't seem to complain at all about a cipher being specified that we
>>> don't find? Guess I would think that we might want to throw a WARNING in such
>>> a case, but I could possibly be convinced otherwise.
>>
>> No, I think you're right, we should throw WARNING there or possibly even a
>> higher elevel. Should that be a COMMERROR even?
>
> I suppose the thought I was having was that we might want to allow some
> string that covered all the OpenSSL and NSS ciphers that someone feels
> comfortable with and we'd just ignore the ones that don't make sense for
> the particular library we're currently built with. Making it a
> COMMERROR seems like overkill and I'm not entirely sure we actually want
> any warning since we might then be constantly bleating about it.

Right, with a string like that we'd induce WARNING fatigue quickly. Catching
the case of *no* ciphers enabled with a COMMERROR is going some way towards
being helpful to the user in debugging the failed connection here.

>>>> +pg_ssl_read(PRFileDesc *fd, void *buf, PRInt32 amount, PRIntn flags,
>>>> + PRIntervalTime timeout)
>>>> +{
>>>> + PRRecvFN read_fn;
>>>> + PRInt32 n_read;
>>>> +
>>>> + read_fn = fd->lower->methods->recv;
>>>> + n_read = read_fn(fd->lower, buf, amount, flags, timeout);
>>>> +
>>>> + return n_read;
>>>> +}
>>>> +
>>>> +static PRInt32
>>>> +pg_ssl_write(PRFileDesc *fd, const void *buf, PRInt32 amount, PRIntn flags,
>>>> + PRIntervalTime timeout)
>>>> +{
>>>> + PRSendFN send_fn;
>>>> + PRInt32 n_write;
>>>> +
>>>> + send_fn = fd->lower->methods->send;
>>>> + n_write = send_fn(fd->lower, buf, amount, flags, timeout);
>>>> +
>>>> + return n_write;
>>>> +}
>>>> +
>>>> +static PRStatus
>>>> +pg_ssl_close(PRFileDesc *fd)
>>>> +{
>>>> + /*
>>>> + * Disconnect our private Port from the fd before closing out the stack.
>>>> + * (Debug builds of NSPR will assert if we do not.)
>>>> + */
>>>> + fd->secret = NULL;
>>>> + return PR_GetDefaultIOMethods()->close(fd);
>>>> +}
>>>
>>> Regarding these, I find myself wondering how they're different from the
>>> defaults..? I mean, the above just directly called
>>> PR_GetDefaultIOMethods() to then call it's close() function- are the
>>> fd->lower_methods->recv/send not the default methods? I don't quite get
>>> what the point is from having our own callbacks here if they just do
>>> exactly what the defaults would do (or are there actually no defined
>>> defaults and you have to provide these..?).
>>
>> It's really just to cope with debug builds of NSPR which assert that fd->secret
>> is null before closing.
>
> And we have to override the recv/send functions for this too..? Sorry,
> my comment wasn't just about the close() method but about the others
> too.

Ah, no we can ditch the .send and .recv functions and stick with the default
built-ins, I just confirmed this and removed them. I think they are leftovers
from when I injected debug code there during development, they were as you say
copies of the default.

>>>> + /*
>>>> + * Return the underlying PRFileDesc which can be used to access
>>>> + * information on the connection details. There is no SSL context per se.
>>>> + */
>>>> + if (strcmp(struct_name, "NSS") == 0)
>>>> + return conn->pr_fd;
>>>> + return NULL;
>>>> +}
>>>
>>> Is there never a reason someone might want the pointer returned by
>>> NSS_InitContext? I don't know that there is but it might be something
>>> to consider (we could even possibly have our own structure returned by
>>> this function which includes both, maybe..?). Not sure if there's a
>>> sensible use-case for that or not just wanted to bring it up as it's
>>> something I asked myself while reading through this patch.
>>
>> Not sure I understand what you're asking for here, did you mean "is there ever
>> a reason"?
>
> Eh, poor wording on my part. You're right, the question, reworded
> again, was "Would someone want to get the context returned by
> NSS_InitContext?". If we think there's a reason that someone might want
> that context then perhaps we should allow getting it, in addition to the
> pr_fd. If there's really no reason to ever want the context from
> NSS_InitContext then what you have here where we're returning pr_fd is
> probably fine.

I can't think of any reason, maybe Jacob who has been knee-deep in NSS contexts
have insights which tell a different story?

>>>> diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
>>>> index c601071838..7f10da3010 100644
>>>> --- a/src/interfaces/libpq/fe-secure.c
>>>> +++ b/src/interfaces/libpq/fe-secure.c
>>>> @@ -448,6 +448,27 @@ PQdefaultSSLKeyPassHook_OpenSSL(char *buf, int size, PGconn *conn)
>>>> }
>>>> #endif /* USE_OPENSSL */
>>>>
>>>> +#ifndef USE_NSS
>>>> +
>>>> +PQsslKeyPassHook_nss_type
>>>> +PQgetSSLKeyPassHook_nss(void)
>>>> +{
>>>> + return NULL;
>>>> +}
>>>> +
>>>> +void
>>>> +PQsetSSLKeyPassHook_nss(PQsslKeyPassHook_nss_type hook)
>>>> +{
>>>> + return;
>>>> +}
>>>> +
>>>> +char *
>>>> +PQdefaultSSLKeyPassHook_nss(PK11SlotInfo * slot, PRBool retry, void *arg)
>>>> +{
>>>> + return NULL;
>>>> +}
>>>> +#endif /* USE_NSS */
>>>
>>> Isn't this '!USE_NSS'?
>>
>> Technically it is, but using just /* USE_NSS */ is consistent with the rest of
>> blocks in the file.
>
> Hrmpf. I guess it seems a bit confusing to me to have to go find the
> opening #ifndef to realize that it's actally !USE_NSS.. In other words,
> I would think we'd actually want to fix all of these, heh. I only
> actually see one case on a quick grep where it's wrong for USE_OPENSSL
> and so that doesn't seem like it's really a precedent and is more of a
> bug. We certainly say 'not OPENSSL' in one place today too and also
> have a number of places where we have: #endif ... /* ! WHATEVER */.

No disagreement from me. To cut down the size of this patchset however I
propose that we tackle this separately and leave this as is in this thread
since it's in line with the rest of the file (for now).

>>>> Subject: [PATCH v30 2/9] Refactor SSL testharness for multiple library
>>>>
>>>> The SSL testharness was fully tied to OpenSSL in the way the server was
>>>> set up and reconfigured. This refactors the SSLServer module into a SSL
>>>> library agnostic SSL/Server module which in turn use SSL/Backend/<lib>
>>>> modules for the implementation details.
>>>>
>>>> No changes are done to the actual tests, this only change how setup and
>>>> teardown is performed.
>>>
>>> Presumably this could be committed ahead of the main NSS support?
>>
>> Correct, I think this has merits even if NSS support is ultimately rejected.
>
> Ok- could you break it out on to its own thread and I'll see about
> committing it soonish, to get it out of the way?

It was already on it's own thread, as we discussed offlist. I have since
rebased and expanded that patch over in that thread which has gotten review
that needs to be addressed. As such, I will not update that patch in the
series in this thread but keep the changes on that thread, and then pull them
back into here when ready.

>>>> Subject: [PATCH v30 5/9] nss: Documentation
>>>> +++ b/doc/src/sgml/acronyms.sgml
>>>> @@ -684,6 +717,16 @@
>>>> </listitem>
>>>> </varlistentry>
>>>>
>>>> + <varlistentry>
>>>> + <term><acronym>TLS</acronym></term>
>>>> + <listitem>
>>>> + <para>
>>>> + <ulink url="https://en.wikipedia.org/wiki/Transport_Layer_Security">
>>>> + Transport Layer Security</ulink>
>>>> + </para>
>>>> + </listitem>
>>>> + </varlistentry>
>>>
>>> We don't have this already..? Surely we should..
>>
>> We really should, especially since we've had <acronym>TLS</acronym> in
>> config.sgml since 2014 (c6763156589). That's another small piece that could be
>> committed on it's own to cut down the size of this patchset (even if only by a
>> tiny amount).
>
> Ditto on this. :)

Done in https://postgr.es/m/27109504-82DB-41A8-8E63-C0498314F5B0@yesql.se

>>>> @@ -1288,7 +1305,9 @@ include_dir 'conf.d'
>>>> connections using TLS version 1.2 and lower are affected. There is
>>>> currently no setting that controls the cipher choices used by TLS
>>>> version 1.3 connections. The default value is
>>>> - <literal>HIGH:MEDIUM:+3DES:!aNULL</literal>. The default is usually a
>>>> + <literal>HIGH:MEDIUM:+3DES:!aNULL</literal> for servers which have
>>>> + been built with <productname>OpenSSL</productname> as the
>>>> + <acronym>SSL</acronym> library. The default is usually a
>>>> reasonable choice unless you have specific security requirements.
>>>> </para>
>>>
>>> Shouldn't we say something here wrt NSS?
>>
>> We should, but I'm not entirely what just yet. Need to revisit that.
>
> Not sure if we really want to do this but at least with ssllabs.com,
> postgresql.org gets an 'A' rating with this set:
>
> ECDHE-ECDSA-CHACHA20-POLY1305
> ECDHE-RSA-CHACHA20-POLY1305
> ECDHE-ECDSA-AES128-GCM-SHA256
> ECDHE-RSA-AES128-GCM-SHA256
> ECDHE-ECDSA-AES256-GCM-SHA384
> ECDHE-RSA-AES256-GCM-SHA384
> DHE-RSA-AES128-GCM-SHA256
> DHE-RSA-AES256-GCM-SHA384
> ECDHE-ECDSA-AES128-SHA256
> ECDHE-RSA-AES128-SHA256
> ECDHE-ECDSA-AES128-SHA
> ECDHE-RSA-AES256-SHA384
> ECDHE-RSA-AES128-SHA
> ECDHE-ECDSA-AES256-SHA384
> ECDHE-ECDSA-AES256-SHA
> ECDHE-RSA-AES256-SHA
> DHE-RSA-AES128-SHA256
> DHE-RSA-AES128-SHA
> DHE-RSA-AES256-SHA256
> DHE-RSA-AES256-SHA
> ECDHE-ECDSA-DES-CBC3-SHA
> ECDHE-RSA-DES-CBC3-SHA
> EDH-RSA-DES-CBC3-SHA
> AES128-GCM-SHA256
> AES256-GCM-SHA384
> AES128-SHA256
> AES256-SHA256
> AES128-SHA
> AES256-SHA
> DES-CBC3-SHA
> !DSS
>
> Which also seems kinda close to what the default when built with OpenSSL
> ends up being? Thought the ssllabs report does list which ones it
> thinks are weak and so we might consider excluding those by default too:
>
> https://www.ssllabs.com/ssltest/analyze.html?d=postgresql.org&s=2a02%3a16a8%3adc51%3a0%3a0%3a0%3a0%3a50

Agreed, maintaining parity (or thereabouts) with OpenSSL defaults taking
industry best practices into account is probably what we should aim for.

>>>> @@ -1490,8 +1509,11 @@ include_dir 'conf.d'
>>>> <para>
>>>> Sets an external command to be invoked when a passphrase for
>>>> decrypting an SSL file such as a private key needs to be obtained. By
>>>> - default, this parameter is empty, which means the built-in prompting
>>>> - mechanism is used.
>>>> + default, this parameter is empty. When the server is using
>>>> + <productname>OpenSSL</productname>, this means the built-in prompting
>>>> + mechanism is used. When using <productname>NSS</productname>, there is
>>>> + no default prompting so a blank callback will be used returning an
>>>> + empty password.
>>>> </para>
>>>
>>> Maybe we should point out here that this requires the database to not
>>> require a password..? So if they have one, they need to set this, or
>>> maybe we should provide a default one..
>>
>> I've added a sentence on not using a password for the cert database. I'm not
>> sure if providing a default one is a good idea but it's no less insecure than
>> having no password really..
>
> I was meaning a default callback to prompt, not sure if that was clear.

Ah, no that's not what I thought you meant. Do you have any thoughts on what
that callback would look like? Take a password on a TTY input?

Below are a few fixes addressed from the original review email:

>>> + /*
>>> + * Set up the custom IO layer.
>>> + */
>>
>> Might be good to mention that the IO Layer is what sets up the
>> read/write callbacks to be used.
>
> Good point, will do in the next version of the patchset.

Fixed.

>>> + port->pr_fd = SSL_ImportFD(model, pr_fd);
>>> + if (!port->pr_fd)
>>> + {
>>> + ereport(COMMERROR,
>>> + (errmsg("unable to initialize")));
>>> + return -1;
>>> + }
>>
>> Maybe a comment and a better error message for this?
>
> Will do.

Fixed.

>>>
>>> + PR_Close(model);
>>
>> This might deserve one also, the whole 'model' construct is a bit
>> different. :)
>
> Agreed. will do.

Fixed.

>> Also, I get that they do similar jobs and that one is in the frontend
>> and the other is in the backend, but I'm not a fan of having two
>> 'ssl_protocol_version_to_nss()'s functions that take different argument
>> types but have exact same name and do functionally different things..
>
> Good point, I'll change that.

Fixed.

>>> + /*
>>> + * Configure cipher policy.
>>> + */
>>> + status = NSS_SetDomesticPolicy();
>>> + if (status != SECSuccess)
>>> + {
>>> + printfPQExpBuffer(&conn->errorMessage,
>>> + libpq_gettext("unable to configure cipher policy: %s"),
>>> + pg_SSLerrmessage(PR_GetError()));
>>> +
>>> + return PGRES_POLLING_FAILED;
>>> + }
>>
>> Probably good to pull over at least some parts of the comments made in
>> the backend code about SetDomesticPolicy() actually enabling everything
>> (just like all the policies apparently do)...
>
> Good point, will do.

Fixed.

>>> +int
>>> +be_tls_open_server(Port *port)
>>> +{
>>> + SECStatus status;
>>> + PRFileDesc *model;
>>> + PRFileDesc *pr_fd;
>>
>> pr_fd here is materially different from port->pr_fd, no? As in, one is
>> the NSS raw TCP fd while the other is the SSL fd, right? Maybe we
>> should use two different variable names to try and make sure they don't
>> get confused? Might even set this to NULL after we are done with it
>> too.. Then again, I see later on that when we do the dance with the
>> 'model' PRFileDesc that we just use the same variable- maybe we should
>> do that? That is, just get rid of this 'pr_fd' and use port->pr_fd
>> always?
>
> Hmm, I think you're right. I will try that for the next patchset version.

>> Similar comments to the backend code- should we just always use
>> conn->pr_fd? Or should we rename pr_fd to something else?
>
> Renaming is probably not a bad idea, will fix.


Both fixed.

Additionally, a few other off-list reported issues are also fixed in this
version (such as fixing the silly markup doc error and testplan off-by-one etc).

--
Daniel Gustafsson https://vmware.com/

Attachment Content-Type Size
v32-0009-nss-Build-infrastructure.patch application/octet-stream 21.3 KB
v32-0008-nss-Support-NSS-in-cryptohash.patch application/octet-stream 6.1 KB
v32-0007-nss-Support-NSS-in-sslinfo.patch application/octet-stream 3.6 KB
v32-0006-nss-Support-NSS-in-pgcrypto.patch application/octet-stream 24.9 KB
v32-0005-nss-Documentation.patch application/octet-stream 33.8 KB
v32-0004-nss-pg_strong_random-support.patch application/octet-stream 2.0 KB
v32-0003-nss-Add-NSS-specific-tests.patch application/octet-stream 52.2 KB
v32-0002-Refactor-SSL-testharness-for-multiple-library.patch application/octet-stream 11.5 KB
v32-0001-nss-Support-libnss-as-TLS-library-in-libpq.patch application/octet-stream 93.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2021-03-25 23:59:16 Re: Support for NSS as a libpq TLS backend
Previous Message Magnus Hagander 2021-03-25 22:52:58 Re: shared memory stats: high level design decisions: consistency, dropping