SSL connections don't cope with server crash very well at all

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: SSL connections don't cope with server crash very well at all
Date: 2008-01-28 01:09:10
Message-ID: 15808.1201482550@sss.pgh.pa.us
Lists: pgsql-hackers

If you do a manual "kill -9" (for testing purposes) on the server process
that psql is connected to, psql normally recovers nicely:

regression=# select 1;
 ?column?
----------
        1
(1 row)

-- issue kill here in another window
regression=# select 1;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
regression=#

But try it with an SSL-enabled connection, and psql just dies rudely.
Investigation shows that it's being killed by SIGPIPE while attempting
to clean up the failed connection:

Program received signal SIGPIPE, Broken pipe.
0x00000030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
(gdb) bt
#0 0x00000030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
#1 0x0000003102497a27 in rl_filename_completion_function ()
    from /lib64/libcrypto.so.6
#2 0x0000003102495e5e in BIO_write () from /lib64/libcrypto.so.6
#3 0x0000003877a1f449 in ssl3_write_pending () from /lib64/libssl.so.6
#4 0x0000003877a1f8b6 in ssl3_dispatch_alert () from /lib64/libssl.so.6
#5 0x0000003877a1d602 in ssl3_shutdown () from /lib64/libssl.so.6
#6 0x00002aaaaaac2675 in close_SSL (conn=0x642d60) at fe-secure.c:1095
#7 0x00002aaaaaabb483 in pqReadData (conn=0x642d60) at fe-misc.c:719
#8 0x00002aaaaaaba9b8 in PQgetResult (conn=0x642d60) at fe-exec.c:1223
#9 0x00002aaaaaabaa8e in PQexecFinish (conn=0x642d60) at fe-exec.c:1452
#10 0x00000000004075b7 in SendQuery (query=<value optimized out>)
    at common.c:853
#11 0x0000000000409cf3 in MainLoop (source=0x30f8151680) at mainloop.c:225
#12 0x000000000040c3dc in main (argc=<value optimized out>, argv=0x100)
    at startup.c:352

Apparently we need to do the SIGPIPE disable/enable dance around
SSL_shutdown() as well as SSL_write(). I wonder whether we don't need
it around SSL_read() as well --- I seem to recall that OpenSSL might
either read or write the socket within SSL_read(), due to various corner
cases in the SSL protocol.
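
For illustration, the disable/enable dance could look roughly like the
sketch below. This is not the libpq code: run_with_sigpipe_ignored() and
write_to_broken_pipe() are hypothetical names, and a write to a pipe with
no reader stands in for the SSL_shutdown() call so that the example needs
nothing beyond POSIX. With SIGPIPE ignored, the failing write() reports
EPIPE to the caller instead of killing the process.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical callback type: an operation that may write to a dead peer. */
typedef int (*io_op) (void *arg);

/*
 * Run op(arg) with SIGPIPE ignored, then restore the previous disposition.
 * Returns op's result, or -1 if the disposition could not be changed.
 * errno as set by op is preserved across the restore.
 */
int
run_with_sigpipe_ignored(io_op op, void *arg)
{
    struct sigaction ignore_act;
    struct sigaction saved_act;
    int         result;
    int         saved_errno;

    assert(op != NULL);

    memset(&ignore_act, 0, sizeof(ignore_act));
    ignore_act.sa_handler = SIG_IGN;
    sigemptyset(&ignore_act.sa_mask);

    /* "Disable" half of the dance: remember the old disposition. */
    if (sigaction(SIGPIPE, &ignore_act, &saved_act) != 0)
        return -1;

    result = op(arg);
    saved_errno = errno;

    /* "Enable" half: put back whatever the application had installed. */
    sigaction(SIGPIPE, &saved_act, NULL);
    errno = saved_errno;

    return result;
}

/*
 * Stand-in for SSL_shutdown() on a dead connection: write one byte to a
 * pipe whose read end is already closed.  Returns 1 if the write failed
 * with EPIPE, 0 if it failed some other way or succeeded, -2 if the pipe
 * could not be created.
 */
int
write_to_broken_pipe(void *arg)
{
    int         fds[2];
    ssize_t     n;
    int         got_epipe;

    (void) arg;

    if (pipe(fds) != 0)
        return -2;
    close(fds[0]);              /* no reader left: writes now raise SIGPIPE */

    n = write(fds[1], "x", 1);
    got_epipe = (n < 0 && errno == EPIPE);

    close(fds[1]);
    return got_epipe;
}
```

Calling run_with_sigpipe_ignored(write_to_broken_pipe, NULL) returns
normally, whereas calling write_to_broken_pipe(NULL) directly under the
default SIGPIPE disposition terminates the process, as in the backtrace
above. Note that the signal disposition is process-wide, so this simple
form is not safe if other threads depend on SIGPIPE at the same time.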

Comments?

regards, tom lane
