Support for NSS as a libpq TLS backend

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Support for NSS as a libpq TLS backend
Date: 2020-05-15 20:46:09
Message-ID: FAB21FC8-0F62-434F-AA78-6BD9336D630A@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The attached patch implements NSS (Network Security Services) [0] with the
required NSPR runtime [1] as a TLS backend for PostgreSQL.

While all sslmodes are implemented and work for the most part, the patch is
*not* ready yet but I wanted to show progress early so that anyone interested
in this can help out with testing and maybe even hacking.

Why NSS? Well. It shares no lineage with OpenSSL making it not just an
alternative by fork but a 100% alternative. It's also actively maintained, is
readily available on many platforms where PostgreSQL is popular and has a FIPS
mode which doesn't require an EOL'd library. And finally, I was asked nicely
with the promise of a free beverage, an incentive as good as any.

Differences with OpenSSL
------------------------
NSS does not use certificates and keys on the filesystem, it instead uses a
certificate database in which all certificates, keys and CRL's are loaded. A
set of tools are provided to work with the database, like: certutil, crlutil,
pk12util etc. We could support plain PEM files as well, and load them into a
database ourselves but so far I've opted for just using what is already in the
database.

This does mean that new GUCs are needed to identify the database. I've mostly
repurposed the existing ones for cert/key/crl, but had to invent a new one for
the database. Maybe there should be an entirely new set? This needs to be
discussed with not only NSS in mind but for additional as-of-yet unknown
backends we might get (SChannel comes to mind).

NSS also supports partial chain validation per default (as do many other TLS
libraries) where OpenSSL does not. I haven't done anything about that just
yet, thus there is a failing test as a reminder to address it.

The documentation of NSS/NSPR is unfortunately quite poor and often times
outdated or simply nonexisting. Cloning the repo and reading the source code
is the only option for parts of the API.

Featurewise there might be other things we can make use of in NSS which doesn't
exist in OpenSSL, but for now I've tried to keep them aligned.

Known Bugs and Limitations (in this version of the patch)
---------------------------------------------------------
The frontend doesn't attempt to verify whether the specified CRL exists in the
database or not. This can be done with pretty much the same code as in the
backend, except that we don't have the client side certificate loaded so we
either need to read it back from the database, or parse a list of all CRLs
(which would save us from having the cert in local memory which generally is a
good thing to avoid).

pgtls_read is calling PR_Recv which works fine for communicating with an NSS
backend cluster, but hangs waiting for IO when communicating with an OpenSSL
backend cluster. Using PR_Read reverses the situation. This is probably a
simple bug but I haven't had time to track it down yet. The below shifts
between the two for debugging.

- nread = PR_Recv(conn->pr_fd, ptr, len, 0, PR_INTERVAL_NO_WAIT);
+ nread = PR_Read(conn->pr_fd, ptr, len);

Passphrase handling in the backend is broken, more on that under TODO.

There are a few failing tests and a few skipped ones for now, but the majority
of the tests pass.

Testing
-------
In order for the TAP framework to be able to handle backends with different
characteristics I've broken up SSLServer.pm into a set of modules:

SSL::Server
SSL::Backend::NSS
SSL::Backend::OpenSSL

The SSL tests import SSL::Server which in turn imports the appropriate backend
module in order to perform backend specific setup tasks. The backend used
should be transparent for the TAP code when it comes to switching server certs
etc.

So far I've used foo|bar in the matching regex to provide alternative output,
and SKIP blocks for tests that don't apply. There might be neater ways to
achieve this, but I was trying to avoid having separate test files for the
different backends.

The certificate databases can be created with a new nssfiles make target in
src/test/ssl, which use the existing files (and also depend on OpenSSL which I
don't think is a problematic dependency for development environments). To keep
it simple I've named the certificates in the NSS database after the filenames,
this isn't really NSS best-practices but it makes for an easier time reading
the tests IMO.

If this direction is of interest, extracting into to a separate patch for just
setting up the modules and implementing OpenSSL without a new backend is
probably the next step.

TODO
----
This patch is a work in progress, and there is work left to do, below is a dump
of what is left to fix before this can be considered a full implementation for
review. Most of these items have more documentation in the code comments.

* The split between init and open needs to be revisited, especially in frontend
where we have a bit more freedom. It remains to be seen if we can do better in
the backend part.

* Documentation, it's currently not even started

* Windows support. I've hacked mostly using Debian and have tested versions of
the patch on macOS, but not Windows so far.

* Figure out how to handle cipher configuration. Getting a set of ciphers that
result in a useable socket isn't as easy as with OpenSSL, and policies seem
much more preferred. At the very least this needs to be solidly documented.

* The rules in src/test/ssl/Makefile for generating certificate databases can
probably be generalized into a smaller set of rules based on wildcards.

* The password callback on server-side won't be invoked at server start due to
init happening in be_tls_open, so something needs to be figured out there.
Maybe attempt to open the database with a throw-away context in init just to
invoke the password callback?

* Identify code duplicated between frontend and backend and try to generalize.

* Make sure the handling the error codes correctly in the certificate and auth
callbacks to properly handle self-signed certs etc.

* Tidy up the tests which are partially hardwired for NSS now to make sure
there are no regressions for OpenSSL.

* All the code using OpenSSL which isn't the libpq communications parts, like
pgcrypto, strong_random, sha2, SCRAM et.al

* Review language in code comments and run pgindent et.al

* Settle on a minimum required version. I've been using NSS 3.42 and NSPR 4.20
simply since they were the packages Debian wanted to install for me, but I'm
quite convinced that we can go a bit lower (featurewise we can go much lower
but there are bugfixes in recent versions that we might want to include).
Anything lower than a version supporting TLSv1.3 seems like an obvious no-no.

I'd be surprised if this is all, but that's at least a start. There isn't
really a playbook on how to add a new TLS backend, but I'm hoping to be able to
summarize the required bits and pieces in README.SSL once this is a bit closer
to completion.

My plan is to keep hacking at this to have it reviewable for the 14 cycle, so
if anyone has an interest in NSS, then I would love to hear feedback on how it
works (and doesn't work).

The 0001 patch contains the full NSS support, and 0002 is a fix for the pgstat
abstraction which IMO leaks backend implementation details. This needs to go
on it's own thread, but since 0001 fails without it I've included it here for
simplicity sake for now.

cheers ./daniel

[0] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS
[1] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSPR

Attachment Content-Type Size
0001-WIP-Support-libnss-for-as-TLS-backend.patch application/octet-stream 129.9 KB
0002-Make-pg_stat_ssl-reporting-backend-agnostic.patch application/octet-stream 1.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2020-05-15 21:46:38 Re: Potentially misleading name of libpq pass phrase hook
Previous Message Christopher Baines 2020-05-15 20:30:26 [PATCH] Fix pg_dump --no-tablespaces for the custom format