BUG #16678: The ecpg connect/test5 test sometimes fails on Windows

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: exclusion(at)gmail(dot)com
Subject: BUG #16678: The ecpg connect/test5 test sometimes fails on Windows
Date: 2020-10-19 17:00:01
Message-ID: 16678-253e48d34dc0c376@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16678
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 13.0
Operating system: Windows
Description:

When executing `call vcregress ecpgcheck` (under a heavy load), I get
connect/test5 failures sometimes:
test connect/test4 ... ok 67 ms
test connect/test5 ... stderr FAILED 945 ms
test pgtypeslib/dt_test ... ok 271 ms

regression.diffs contains:
diff -w -U3 .../src/interfaces/ecpg/test/expected/connect-test5.stderr
.../src/interfaces/ecpg/test/results/connect-test5.stderr
--- .../src/interfaces/ecpg/test/expected/connect-test5.stderr 2020-10-13
21:51:14 +0300
+++ .../src/interfaces/ecpg/test/results/connect-test5.stderr 2020-10-19
19:31:41 +0300
@@ -36,7 +36,9 @@
[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ECPGconnect: opening database <DEFAULT> on <DEFAULT> port
<DEFAULT> for user regress_ecpg_user2
[NO_PID]: sqlca: code: 0, state: 00000
-[NO_PID]: ECPGconnect: could not open database: FATAL: database
"regress_ecpg_user2" does not exist
+[NO_PID]: ECPGconnect: could not open database: server closed the
connection unexpectedly
+ This probably means the server terminated abnormally
+ before or while processing the request.

[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ecpg_finish: connection main closed

postmaster.log contains:
...
2020-10-19 19:31:40.150 MSK client backend[928] ecpg/connect-test4 LOG:
could not receive data from client: An existing connection was forcibly
closed by the remote host.

2020-10-19 19:31:40.677 MSK client backend[2032] [unknown] FATAL: database
"regress_ecpg_user2" does not exist
2020-10-19 19:31:41.052 MSK client backend[2532] [unknown] FATAL: database
"regress_ecpg_user2" does not exist
2020-10-19 19:31:49.834 MSK client backend[2544] ecpg/preproc-whenever
ERROR: relation "nonexistent" does not exist at character 15
...
So the server didn't crash, it logged the expected message correctly.

The following delay added to pqReadData (src/interfaces/libpq/fe-misc.c):
@@ -662,6 +662,8 @@ pqReadData(PGconn *conn)
}
}

+ pg_usleep(10000L); /* wait for 10 msec */
+
/* OK, try to read some data */
retry3:
nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,

Makes the failure reproduce reliably (on each run) for me.
Also this affects the pg_dump test:
vcregress taptest src/bin/pg_dump
...
t/001_basic.pl ................ ok
t/002_pg_dump.pl .............. 1/6315
# Failed test 'connecting to a non-existent database: matches'
# at t/002_pg_dump.pl line 3460.
# 'pg_dump: error: connection to database "qqq" failed:
server
closed the connection unexpectedly
# This probably means the server terminated abnormally
# before or while processing the request.
# '
# doesn't match '(?^:pg_dump\:\ error\:\ connection\ to\ database\
\"qqq\"\
failed\:\ FATAL\:\ \ database\ \"qqq\"\ does\ not\ exist)'
t/002_pg_dump.pl .............. 6113/6315 # Looks like you failed 1 test of
6315
.
t/002_pg_dump.pl .............. Dubious, test returned 1 (wstat 256,
0x100)
Failed 1/6315 subtests
t/003_pg_dump_with_server.pl .. ok
...

On Linux the same delay (and even 100 msec) doesn't cause these test
failures.
It seems that on Windows the postmaster process can close a connection
before slow client receives a FATAL message.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2020-10-19 18:47:01 Re: Postgres 13 signal 11: Segmentation fault tested on 2 independent machines
Previous Message Tom Lane 2020-10-19 14:31:55 Re: BUG #16673: Stack depth limit exceeded error while running sysbench TPC-C