Re: Rare SSL failures on eelpout

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Rare SSL failures on eelpout
Date: 2019-03-18 23:25:42
Message-ID: CA+hUKGJowQypXSKsjws9A+nEQDD0-mExHZqFXtJ09N209rCO5A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 19, 2019 at 9:11 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > ... I don't like pqHandleSendFailure all that much: it has strong
> > constraints on what state libpq has to be in, as a consequence of which
> > it's called from a bunch of ad-hoc places, and can't be called from
> > some others. It's kind of accidental that it will work here.
> > I was toying with the idea of getting rid of that function in
> > favor of a design like this:
> > * Handle any socket-write failure at some low level of libpq by
> > recording the fact that the error occurred (plus whatever data we
> > need to produce a message about it), and then starting to just
> > discard output data.
> > * Eventually, we'll try to read. Process any available input data
> > as usual (and, if we get a read error, report that). When no more
> > input data is available, if a socket write failure has been recorded,
> > report that much as if it were an incoming ERROR message, and then
> > shut down the connection.
> > This would essentially give us pqHandleSendFailure-like behavior
> > across the board, which might be enough to fix this problem as well as
> > bug #15598. Or not ... as you say, we haven't thoroughly understood
> > either issue, so it's possible this wouldn't do it.
>
> Here's a draft patch along that line. It's gratifyingly small, and
> it does fix the SSL problem for me. I have no way of investigating
> whether it fixes bug #15598, but it might.
>
> Aside from the SSL cert-verify issue, I've checked the behavior when
> the backend is shut down ("pg_ctl stop") between queries, and when
> the backend is kill 9'd mid-query. The shutdown case reacts well with
> or without SSL:
>
> regression=# select 2+2;
> FATAL: terminating connection due to administrator command
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> The backend-crash case is fine without SSL:
>
> regression=# select * from tenk1 t1, tenk1 t2, tenk1 t3;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> but a bit less nice with it:
>
> regression=# select * from tenk1 t1, tenk1 t2, tenk1 t3;
> SSL SYSCALL error: EOF detected
> The connection to the server was lost. Attempting reset: Succeeded.
>
> That seems cosmetic --- we just haven't expended the same effort on
> what to print for connection EOF with SSL as without it. (Also,
> you get that same result without this patch.)
>
> Of course, this hardly counts as exhaustive testing, especially since
> I only tried one OpenSSL version.

Neat. Fixes the problem for me on eelpout's host.
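
To make the quoted design concrete, here's a toy Python sketch of the
deferred-reporting idea (purely illustrative; the real change is C
inside libpq, and every name here is invented):

class Conn:
    # Toy connection illustrating the deferred-reporting idea.
    def __init__(self, sock):
        self.sock = sock
        self.write_failed = None          # recorded send error

    def send(self, data):
        if self.write_failed is not None:
            return                        # discard output after a failure
        try:
            self.sock.sendall(data)
        except OSError as err:
            self.write_failed = err       # record it; don't report yet

    def read_reply(self):
        # Process whatever input the server managed to send first ...
        while True:
            chunk = self.sock.recv(4096)  # a real read error is raised as-is
            if not chunk:
                break
            print("server said:", chunk)  # e.g. an incoming ERROR message
        # ... and only then report any recorded write failure.
        if self.write_failed is not None:
            raise self.write_failed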

> > One thing that isn't real clear to me is how much timing sensitivity
> > there is in "when no more input data is available". Can we assume that
> > if we've gotten ECONNRESET or an allied error from a write, then any
> > data the far end might've wished to send us is already available to
> > read?

Following a trail beginning at
https://en.wikipedia.org/wiki/Transmission_Control_Protocol I see that
RFC 1122 4.2.2.13 discusses this topic and the variations allowed
here. I don't know enough about any of this stuff to draw hard
conclusions from primary sources, but it does appear that an
implementation might be within its rights to jettison that data,
unfortunately. With the simple Python script experiment up-thread, we
tested four implementations and (amusingly) got four different
behaviours (a sketch of that experiment appears after the list):

1. For Windows, the answer to your question was clearly "no" in that
experiment. I think this behaviour sucks for any protocol that
involves unsolicited messages in both directions (that is, not based
on pure request/response processing), because there is no way to
guarantee that you receive a "goodbye" message, and that's what's
happening in bug #15598. I'm not sure there is anything we can ever
do about this, though, other than complaining on the internet or
opening a second TCP connection for unsolicited server->client
messages.

2. Linux, FreeBSD and Darwin gave slightly different error sequences
when writing after the remote connection was closed (though I suspect
they'd behave much the same way for a connection to a remote host),
but all allowed the "goodbye" message to be read, so the answer there
is "yes".

> > In principle, since TCP allows shutting down either transmission
> > direction independently, the server could close the read side of its
> > socket while continuing to transmit data. In practice, PG servers
> > don't do that; but could network timing glitches create the same effect?
> > Even if it's possible, does it happen enough to worry about?
> > The reason I'm concerned is that I don't think it'd be bright to ignore a
> > send error until we see input EOF, which'd be the obvious way to solve a
> > timing problem if there is one. If our send error was some transient
> > glitch and the connection is really still open, then we might not get EOF,
> > but we won't get a server reply either because no message went to the
> > server. You could imagine waiting some small interval of time before
> > deciding that it's time to report the write failure, but ugh.
>
> As written, this patch does seem vulnerable to this concern, if it's real.
> We could address it for post-connection failures by tweaking PQgetResult
> so that it doesn't call pqWait if there's already a write error recorded;
> but I'm far from sure that that would be a net improvement. Also, making
> a comparable change to the behavior of the PQconnectPoll state machine
> might be tricky.

I think ignoring send errors until you eventually get a recv error or
an EOF is OK, because that's roughly how sending works in TCP in
general anyway: you find out that the other end has gone away a bit
later. That is the reason you can't reproduce this problem on a
FreeBSD box with OpenSSL 1.1.1 even if you add the sleep, or
(presumably, though I didn't test) on a Linux system with a remote
server: the send() succeeds because it doesn't or can't peek at the
other end's TCP state and see that it's already been closed; it merely
transmits the message, and the other end later sends back an RST that
will cause some future send() syscall to fail.
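
That delayed discovery is easy to see with plain sockets; here's a
minimal loopback sketch (invented payloads; the exact errno varies by
platform):

import socket
import time

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.socket()
client.connect(listener.getsockname())
server, _ = listener.accept()
server.close()                  # the peer goes away

client.sendall(b"x")            # typically *succeeds*: the kernel just
                                # queues and transmits it; the peer then
                                # answers with an RST
time.sleep(0.5)                 # let the RST come back

try:
    client.sendall(b"x")        # now send() fails on the reset connection
except OSError as err:
    print("send failed:", err)  # e.g. ECONNRESET or EPIPE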

I can't imagine what sort of transient glitch could apply here. I
guess the question is: is there a way for send() to report an error,
but recv() to report EAGAIN so we finish up blocked? I can't
immediately see how that could happen.

> My current feeling is that this is OK to put in HEAD but I think the
> risk-reward ratio isn't very good for the back branches. Even with
> an OpenSSL version where this makes a difference, the problematic
> behavior is pretty hard to hit. So I'm a bit inclined to do nothing
> in the back branches.

Shouldn't we also back-patch the one-line change that adds the
pqHandleSendFailure() call? Then eelpout would be green for
REL_11_STABLE, and so would any future buildfarm animals (probably
underpowered ones with wonky scheduling) once they are eventually
upgraded to OpenSSL 1.1.1.

--
Thomas Munro
https://enterprisedb.com
