Re: Support for NSS as a libpq TLS backend

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Jacob Champion <pchampion(at)vmware(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Support for NSS as a libpq TLS backend
Date: 2020-11-13 12:14:58
Message-ID: 7B5A88DF-863A-4B00-B841-0285C1FF70DF@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 12 Nov 2020, at 23:12, Jacob Champion <pchampion(at)vmware(dot)com> wrote:
>
> On Nov 11, 2020, at 10:57 AM, Jacob Champion <pchampion(at)vmware(dot)com> wrote:
>>
>> False alarm -- the stderr debugging I'd added in to track down the
>> assertion tripped up the "no stderr" tests. Zero failing tests now.
>
> I took a look at the OpenSSL interop problems you mentioned upthread.

Great, thanks!

> I don't see a hang like you did, but I do see a PR_IO_TIMEOUT_ERROR during
> connection.
>
> I think pgtls_read() needs to treat PR_IO_TIMEOUT_ERROR as if no bytes
> were read, in order to satisfy its API. There was some discussion on
> this upthread:
>
> On Oct 27, 2020, at 1:07 PM, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>>
>> On 20 Oct 2020, at 21:15, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>>
>>>> + case PR_IO_TIMEOUT_ERROR:
>>>> + break;
>>>
>>> What does this mean? We'll return with a 0 errno here, right? When is
>>> this case reachable?
>>
>> It should, AFAICT, only be reachable when PR_Recv is used with a timeout which
>> we don't do. It mentioned somewhere that it had happened in no-wait calls due
>> to a bug, but I fail to find that reference now. Either way, I've removed it
>> to fall into the default error handling which now sets errno correctly as that
>> was a paddle short here.
>
> PR_IO_TIMEOUT_ERROR is definitely returned in no-wait calls on my
> machine. It doesn't look like the PR_Recv() API has a choice -- if
> there's no data, it can't return a positive integer, and returning zero
> means that the socket has been disconnected. So -1 with a timeout error
> is the only option.

Right, that makes sense.

> I'm not completely sure why this is exposed so easily with an OpenSSL
> server -- I'm guessing the implementation slices up its packets
> differently on the wire, causing a read event before NSS is able to
> decrypt a full record -- but it's worth noting that this case also shows
> up during NSS-to-NSS psql connections, when handling notifications at
> the end of every query. PQconsumeInput() reports a hard failure with the
> current implementation, but its return value is ignored by
> PrintNotifications(). Otherwise this probably would have showed up
> earlier.

Should there perhaps be an Assert there to catch those?

> (What's the best way to test this case? Are there lower-level tests for
> the protocol/network layer somewhere that I'm missing?)

Not AFAIK. Having been knee-deep now, do you have any ideas on how to
implement?

> While patching this case, I also noticed that pgtls_read() doesn't call
> SOCK_ERRNO_SET() for the disconnection case. That is also in the
> attached patch.

Ah yes, nice catch.

I've incorporated this patch as well as the previous patch for the assertion
failure on private callback data into the attached v19 patchset. I also did a
spellcheck and pgindent run on it for ease of review.

cheers ./daniel

Attachment Content-Type Size
v19-0001-NSS-Frontend-Backend-and-build-infrastructure.patch application/octet-stream 108.6 KB
v19-0002-NSS-Testharness-updates.patch application/octet-stream 49.5 KB
v19-0003-NSS-pg_strong_random-support.patch application/octet-stream 4.4 KB
v19-0004-NSS-Documentation.patch application/octet-stream 14.2 KB
v19-0005-NSS-contrib-modules.patch application/octet-stream 29.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-11-13 12:15:15 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Andrey Borodin 2020-11-13 11:49:36 Re: MultiXact\SLRU buffers configuration