Re: Rare SSL failures on eelpout

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Rare SSL failures on eelpout
Date: 2019-03-05 18:05:55
Message-ID: CA+hUKGJDOLkCcuT3q4Ofu8Ojo9n4PNKsUc1108tv=r1=36bbeQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 6, 2019 at 6:07 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > You can see that poll() already knew the other end had closed the
> > socket. Since this is clearly timing... let's see, yeah, I can make
> > it fail every time by adding sleep(1) before the comment "Send the
> > startup packet.". I assume that'll work on any Linux machine?
>
> Great idea, but no cigar --- doesn't do anything for me except make
> the ssl test really slow. (I tried it on RHEL6 and Fedora 28 and, just
> for luck, current macOS.) What this seems to prove is that the thing
> that's different about eelpout is the particular kernel it's running,
> and that that kernel has some weird timing behavior in this situation.
>
> I've also been experimenting with reducing libpq's SO_SNDBUF setting
> on the socket, with more or less the same idea of making the sending
> of the startup packet slower. No joy there either.
>
> Annoying. I'd be happier about writing code to fix this if I could
> reproduce it :-(
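
For anyone who wants to try the SO_SNDBUF experiment outside libpq, here
is a minimal sketch in Python, whose socket module wraps the same
setsockopt()/getsockopt() calls. The helper name shrink_send_buffer and
the 1024-byte request are illustrative assumptions, not anything taken
from libpq. Note that the kernel treats the value as a hint: Linux, for
example, doubles it and enforces a minimum, so the sketch reads the
effective size back rather than assuming the request was honoured.

```python
import socket


def shrink_send_buffer(sock, requested=1024):
    """Ask the kernel for a small send buffer; return the effective size.

    Hypothetical helper for experimentation only. The requested value is
    a hint, so the size read back may differ from what was asked for.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("effective SO_SNDBUF:", shrink_send_buffer(s))
    s.close()
```

A smaller send buffer makes large writes more likely to block or complete
partially, which is why it can serve as a crude way to slow a sender
down. It does nothing for a write that already fits in the buffer.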

Hmm. Note that eelpout only started failing this way with OpenSSL
1.1.1. But I just tried the sleep(1) trick on an x86 box running the
same versions of Debian, OpenSSL, etc., and it didn't reproduce the
failure. So eelpout (a super cheap virtualised 4-core ARMv8 system
rented from scaleway.com, running Debian Buster with a kernel
identifying itself as 4.9.23-std-1 and libc6 2.28-7) is indeed
starting to look pretty weird. Let me know if you want to log in and
experiment on that machine.
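
The general situation behind the sleep(1) trick can be sketched without
libpq or OpenSSL. The following is a rough Python illustration, assuming
a loopback TCP connection. The function name observe_peer_close, the
delay parameter and the message text are all hypothetical, and this is
not the code path libpq actually takes. The peer writes a final message
and closes, the client is delayed before it would send, and poll()
already reports the socket as readable by then.

```python
import select
import socket
import time


def observe_peer_close(delay=0.1):
    """Return (readable, data) for a peer that closes before we send."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick one
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # The "server" sends one last message and closes its end first.
    # The message text is made up for this sketch.
    server_side.sendall(b"fatal: certificate rejected")
    server_side.close()
    listener.close()

    # Stand-in for the sleep(1) placed before sending the startup packet.
    time.sleep(delay)

    # poll() reports the socket readable: the peer's data and its close
    # have already arrived, before the client has written anything.
    poller = select.poll()
    poller.register(client, select.POLLIN)
    events = poller.poll(1000)  # timeout in milliseconds
    readable = any(mask & select.POLLIN for _, mask in events)

    # Reading at this point returns the peer's final message.
    data = client.recv(1024)
    client.close()
    return readable, data


if __name__ == "__main__":
    print(observe_peer_close())
```

Whether a write issued at that point succeeds, fails, or causes the
pending data to be lost depends on the kernel and on timing, which is the
part that appears to differ on eelpout. The sketch deliberately does not
attempt the write.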

--
Thomas Munro
https://enterprisedb.com
