|From:||Daniel Gustafsson <daniel(at)yesql(dot)se>|
|To:||Jacob Champion <pchampion(at)vmware(dot)com>|
|Cc:||Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>|
|Subject:||Re: Support for NSS as a libpq TLS backend|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
> On 12 Nov 2020, at 23:12, Jacob Champion <pchampion(at)vmware(dot)com> wrote:
> On Nov 11, 2020, at 10:57 AM, Jacob Champion <pchampion(at)vmware(dot)com> wrote:
>> False alarm -- the stderr debugging I'd added in to track down the
>> assertion tripped up the "no stderr" tests. Zero failing tests now.
> I took a look at the OpenSSL interop problems you mentioned upthread.
> I don't see a hang like you did, but I do see a PR_IO_TIMEOUT_ERROR during
> I think pgtls_read() needs to treat PR_IO_TIMEOUT_ERROR as if no bytes
> were read, in order to satisfy its API. There was some discussion on
> this upthread:
> On Oct 27, 2020, at 1:07 PM, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>> On 20 Oct 2020, at 21:15, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>>> + case PR_IO_TIMEOUT_ERROR:
>>>> + break;
>>> What does this mean? We'll return with a 0 errno here, right? When is
>>> this case reachable?
>> It should, AFAICT, only be reachable when PR_Recv is used with a timeout which
>> we don't do. It mentioned somewhere that it had happened in no-wait calls due
>> to a bug, but I fail to find that reference now. Either way, I've removed it
>> to fall into the default error handling which now sets errno correctly as that
>> was a paddle short here.
> PR_IO_TIMEOUT_ERROR is definitely returned in no-wait calls on my
> machine. It doesn't look like the PR_Recv() API has a choice -- if
> there's no data, it can't return a positive integer, and returning zero
> means that the socket has been disconnected. So -1 with a timeout error
> is the only option.
Right, that makes sense.
> I'm not completely sure why this is exposed so easily with an OpenSSL
> server -- I'm guessing the implementation slices up its packets
> differently on the wire, causing a read event before NSS is able to
> decrypt a full record -- but it's worth noting that this case also shows
> up during NSS-to-NSS psql connections, when handling notifications at
> the end of every query. PQconsumeInput() reports a hard failure with the
> current implementation, but its return value is ignored by
> PrintNotifications(). Otherwise this probably would have showed up
Should there perhaps be an Assert there to catch those?
> (What's the best way to test this case? Are there lower-level tests for
> the protocol/network layer somewhere that I'm missing?)
Not AFAIK. Having been knee-deep now, do you have any ideas on how to
> While patching this case, I also noticed that pgtls_read() doesn't call
> SOCK_ERRNO_SET() for the disconnection case. That is also in the
> attached patch.
Ah yes, nice catch.
I've incorporated this patch as well as the previous patch for the assertion
failure on private callback data into the attached v19 patchset. I also did a
spellcheck and pgindent run on it for ease of review.
|Next Message||Amit Kapila||2020-11-13 12:15:15||Re: [HACKERS] logical decoding of two-phase transactions|
|Previous Message||Andrey Borodin||2020-11-13 11:49:36||Re: MultiXact\SLRU buffers configuration|