Re: MSVC SSL test failure

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Lars Kanis <lars(at)greiz-reinsdorf(dot)de>
Subject: Re: MSVC SSL test failure
Date: 2021-12-07 07:00:00
Message-ID: 1d72bc2e-c4fa-e476-b31b-daaee51166f6@gmail.com
Lists: pgsql-hackers

06.12.2021 23:51, Andrew Dunstan wrote:
> I have been getting 100% failures on the SSL tests with closesocket()
> alone, and 100% success over 10 tests with this:
>
>
> diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
> index 96ab37c7d0..5998c089b0 100644
> --- a/src/backend/libpq/pqcomm.c
> +++ b/src/backend/libpq/pqcomm.c
> @@ -295,6 +295,7 @@ socket_close(int code, Datum arg)
>          * Windows too.  But it's a lot more fragile than the other way.
>          */
>  #ifdef WIN32
> +       shutdown(MyProcPort->sock, SD_SEND);
>         closesocket(MyProcPort->sock);
>  #endif
>
>
> That said, your results are quite worrying.
My latest results are as follows:
It seems that the test failure rate may depend on the specs/environment.
With the close-only version, and CPU usage for my Windows VM limited to
20%, I got failures on iterations 10, 2, 1.
With 100% CPU I saw 20 successful runs, then failures on iterations 5
and 2; after a clean & build, failures on iterations 11, 6, 3.  (So maybe
caching is another factor.)

shutdown(MyProcPort->sock, SD_SEND) apparently fixes the issue: I got
83 successful runs, but then iteration 84 unfortunately failed:
t/001_ssltests.pl .. 106/110
#   Failed test 'intermediate client certificate is missing: matches'
#   at t/001_ssltests.pl line 608.
#                   'psql: error: connection to server at "127.0.0.1",
port 63187 failed: could not receive data from server: Software caused
connection abort (0x00002745/10053)
# SSL SYSCALL error: Software caused connection abort (0x00002745/10053)
# could not send startup packet: No error (0x00000000/0)'
#     doesn't match '(?^:SSL error: tlsv1 alert unknown ca)'
# Looks like you failed 1 test of 110.
t/001_ssltests.pl .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/110 subtests
        (less 2 skipped subtests: 107 okay)

It's not the failure that we observed with the close-only fix, but it is
still worrying. And then exactly this failure occurred again, on iteration 8.

But "fortunately" I also got the same failure as before:
t/001_ssltests.pl .. 106/110
#   Failed test 'certificate authorization fails with revoked client
cert with server-side CRL directory: matches'
#   at t/001_ssltests.pl line 618.
#                   'psql: error: connection to server at "127.0.0.1",
port 59220 failed: server closed the connection unexpectedly
#       This probably means the server terminated abnormally
#       before or while processing the request.
# server closed the connection unexpectedly
#       This probably means the server terminated abnormally
#       before or while processing the request.
# server closed the connection unexpectedly
#       This probably means the server terminated abnormally
#       before or while processing the request.'
#     doesn't match '(?^:SSL error: sslv3 alert certificate revoked)'
# Looks like you failed 1 test of 110.
t/001_ssltests.pl .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/110 subtests
        (less 2 skipped subtests: 107 okay)
on the 145th iteration of the test, even without close() (I was trying to
check whether the aforementioned failure existed before the fix).

So we have probably found practical evidence of the importance of
shutdown() that we missed before, but that's not the end of it.
There was some test instability even without the close() fix, and it
remains with shutdown(..., SD_SEND).
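
For reference, the ordering under discussion (send the last message,
shut down the send side, then close) can be sketched portably as below.
This is only an illustration under assumptions, not the pqcomm.c code:
it uses a POSIX AF_UNIX socketpair instead of a Winsock TCP connection,
SHUT_WR stands in for SD_SEND, and half_close_demo() is a hypothetical
helper name. It does not reproduce the Windows failure; it only shows
that the peer sees the final message followed by a clean EOF.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send a final message, half-close, then close.
 * SHUT_WR is the POSIX counterpart of Winsock's SD_SEND.
 * Returns 0 if the peer read the whole message and then a clean EOF,
 * -1 otherwise.  The message (including its NUL) is copied into "out".
 */
int
half_close_demo(char *out, size_t outlen)
{
    int         sv[2];
    const char  msg[] = "final error message";
    size_t      got = 0;
    ssize_t     n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* "server" side: write the last message, then stop sending */
    if (write(sv[0], msg, sizeof(msg)) != (ssize_t) sizeof(msg) ||
        shutdown(sv[0], SHUT_WR) != 0)
    {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);

    /* "client" side: read until EOF (read() returning 0) */
    for (;;)
    {
        if (got == outlen)
        {
            n = -1;             /* buffer full before EOF was seen */
            break;
        }
        n = read(sv[1], out + got, outlen - got);
        if (n <= 0)
            break;
        got += (size_t) n;
    }
    close(sv[1]);

    return (n == 0 && got == sizeof(msg)) ? 0 : -1;
}
```

Calling half_close_demo(buf, sizeof(buf)) with a 64-byte buffer should
return 0 and leave the message in buf.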

By the way, while exploring OpenSSL's behavior, I found that
SSL_shutdown() has its own quirks (see [1], return value 0). Maybe now
we've encountered one of these.

Best regards,
Alexander

[1] https://www.openssl.org/docs/man3.0/man3/SSL_shutdown.html
