Quick Links

Re: Support for NSS as a libpq TLS backend

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc:	Julien Rouhaud <rjuju123(at)gmail(dot)com>, Jacob Champion <pchampion(at)vmware(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "andrew(dot)dunstan(at)2ndquadrant(dot)com" <andrew(dot)dunstan(at)2ndquadrant(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "sfrost(at)snowman(dot)net" <sfrost(at)snowman(dot)net>
Subject:	Re: Support for NSS as a libpq TLS backend
Date:	2022-01-26 23:59:39
Message-ID:	20220126235939.i2r3dlqrouz2ohga@alap3.anarazel.de
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On 2022-01-26 21:39:16 +0100, Daniel Gustafsson wrote:
> > What about
> > reconfiguring (note: add --enable-depend) the linux tasks to build against
> > nss, and then run the relevant subset of tests with it? Most tests don't use
> > tcp / SSL anyway, so rerunning a small subset of tests should be feasible?
>
> That's an interesting idea, I think that could work and be reasonably readable
> at the same time (and won't require in-depth knowledge of Cirrus). As it's the
> same task it does spend more time towards the max runtime per task, but that's
> not a problem for now. It's worth keeping in mind though if we deem this to be
> a way forward with testing multiple settings.

I think it's a way for a limited number of settings, that each only require a
limited amount of tests... Rerunning all tests etc is a different story.

> > Is it a great idea to have common/nss.h when there's a library header nss.h?
> > Perhaps we should have a pg_ssl_{nss,openssl}.h or such?
>
> That's a good point, I modelled it after common/openssl.h but I agree it's
> better to differentiate the filenames. I've renamed it to common/pg_nss.h and
> we should IMO rename common/openssl.h regardless of what happens to this patch.

> > Does this make some things notably more expensive? Presumably it does remove a
> > bunch of COW opportunities, but likely that's not a huge factor compared to
> > assymetric crypto negotiation...
>
> Right, the context of setting up crypto across a network connection it's highly
> likely to drown out the costs.

If you start to need to run a helper to decrypt an encrypted private key, and
do all the initialization, I'm not sure sure that holds true anymore... Have
you done any connection speed tests? pgbench -C is helpful for that.

> > Maybe soem of this commentary should migrate to the file header or such?
>
> Maybe, or perhaps README.ssl? Not sure where it would be most reasonable to
> keep it such that it's also kept up to date.

Either would work for me.

> >> This introduce
> >> + * differences with the OpenSSL support where some errors are only reported
> >> + * at runtime with NSS where they are reported at startup with OpenSSL.
> >
> > Found this sentence hard to parse somehow.
> >
> > It seems pretty unfriendly to only have minimal error checking at postmaster
> > startup time. Seems at least the presence and usability of keys should be done
> > *also* at that time?
>
> I'll look at adding some setup, and subsequent teardown, of NSS at startup
> during which we could do checking to be more on par with how the OpenSSL
> backend will report errors.

Cool.

> >> +/*
> >> + * raw_subject_common_name
> >> + *
> >> + * Returns the Subject Common Name for the given certificate as a raw char
> >> + * buffer (that is, without any form of escaping for unprintable characters or
> >> + * embedded nulls), with the length of the buffer returned in the len param.
> >> + * The buffer is allocated in the TopMemoryContext and is given a NULL
> >> + * terminator so that callers are safe to call strlen() on it.
> >> + *
> >> + * This is used instead of CERT_GetCommonName(), which always performs quoting
> >> + * and/or escaping. NSS doesn't appear to give us a way to easily unescape the
> >> + * result, and we need to store the raw CN into port->peer_cn for compatibility
> >> + * with the OpenSSL implementation.
> >> + */
> >
> > Do we have a testcase for embedded NULLs in common names?
>
> We don't, neither for OpenSSL or NSS. AFAICR Jacob spent days trying to get a
> certificate generation to include an embedded NULL byte but in the end gave up.
> We would have to write our own tools for generating certificates to add that
> (which may or may not be a bad idea, but it hasn't been done).

Hah, that's interesting.

> >> +/*
> >> + * pg_SSLShutdownFunc
> >> + * Callback for NSS shutdown
> >> + *
> >> + * If NSS is terminated from the outside when the connection is still in use
> >
> > What does "NSS is terminated from the outside when the connection" really
> > mean? Does this mean the client initiating something?
>
> If an extension, or other server-loaded code, interfered with NSS and managed
> to close contexts in order to interfere with connections this would ensure us
> closing it down cleanly.
>
> That being said, I was now unable to get my old testcase working so I've for
> now removed this callback from the patch until I can work out if we can make
> proper use of it. AFAICS other mature NSS implementations aren't using it
> (OpenLDAP did in the past but have since removed it, will look at how/why).

I think that'd be elog(FATAL) time if we want to do anything (after changing
state so that no data is sent to client).

> >> + /*
> >> + * The error cases for PR_Recv are not documented, but can be
> >> + * reverse engineered from _MD_unix_map_default_error() in the
> >> + * NSPR code, defined in pr/src/md/unix/unix_errors.c.
> >> + */
> >
> > Can we propose a patch to document them? Don't want to get bitten by this
> > suddenly changing...
>
> I can certainly propose something on their mailinglist, but I unfortunately
> wouldn't get my hopes up too high as NSS and documentation aren't exactly best
> friends (the in-tree docs doesn't cover the API and Mozilla recently removed
> most of the online docs in their neverending developer site reorg).

Kinda makes me question the wisdom of starting to depend on NSS. When openssl
docs are vastly outshining a library's, that library really should start to
ask itself some hard questions.

> In order to find a good split I think we need to figure what to optimize for;
> do we optimize for ease of reverting should that be needed, or along
> functionality borders, or something else? I don't have good ideas here, but a
> single 7596 insertions(+), 421 deletions(-) commit is clearly not a good idea.

I think the goal should be the ability to incrementally commit.

> Stephen had an idea off-list that we could look at splitting this across the
> server/client boundary, which I think is the only idea I've so far which has
> legs. (The first to go in would come with the common code of course.)

Yea, that's the most obvious one. I suspect client-side has a lower
complexity, because it doesn't need to replace quite as many things?

> The attached v53 incorporates the fixes discussed above, and builds green for
> both OpenSSL and NSS in Cirrus on my Github repo (thanks again for your work on
> those files) so it will be interesting to see the CFBot running them.

Looks like that worked...

Greetings,

Andres Freund

In response to

Re: Support for NSS as a libpq TLS backend at 2022-01-26 20:39:16 from Daniel Gustafsson

Responses

Re: Support for NSS as a libpq TLS backend at 2022-01-27 00:51:59 from Jacob Champion
Re: Support for NSS as a libpq TLS backend at 2022-01-28 14:08:09 from Daniel Gustafsson

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jacob Champion	2022-01-27 00:51:59	Re: Support for NSS as a libpq TLS backend
Previous Message	Greg Stark	2022-01-26 23:56:07	Re: autovacuum prioritization