Escaping from blocked send() reprised.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Escaping from blocked send() reprised.
Date: 2014-06-30 08:13:47
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello, I have received inquiries related to blocked communication
several times for these weeks with different symptoms. Then I
found this message from archive,

> Subject: Escaping a blocked sendto() syscall without causing a restart

Mr. Tom Lane gave a comment replying it,

> Offhand it looks to me like most signals would kick the backend off the
> send() call ... but it would loop right back and try again. See
> internal_flush() in pqcomm.c. (If you're using SSL, this diagnosis
> may or may not apply.)
> We can't do anything except repeat the send attempt if the client
> connection is to be kept in a sane state.
> And I'm not at all sure if we could get it to work in SSL mode...

That's true for timeouts that should continue the connection,
say, statement_timeout, but focusing on intentional backend
termination, I think it does no harm to break it up abruptly,
even if it was on SSL. On the other hand it seems still
preferable to keep a connection when not blocked. The following
expression would detects such a blocking state at just before
next send(2) after the previous try exited by signals.

(ProcDiePending && select(1, NULL, fd, NULL, '1 sec') == 0)

Finally, pg_terminate_backend() works even when send is blocked
for both SSL and non-SSL connections after 1 second delay with
this patch (break_socket_blocking_on_termination_v1.patch).

Nevetheless, of course statement_timeout cannot become effective
by this method since it breaks the consistency in the client
protocol. It needs change in client protocol to have "out of
band" mechanism or something, maybe.

Any suggestions?

Attached patches are:

- break_socket_blocking_on_termination_v1.patch : The patch to
break blocked state of send(2) for pg_terminate_backend().

- socket_block_test.patch : debug printing and changing send
buffer of libpq for reproducing the blocked situation.

Some point of discussion follows,

==== Discussion about the appropriateness of looking into
ProcDiePending there and calling CHECK_FOR_INTERRUPTS()
seeing it.

I have somewhat uneasiness of these things, but what we can at
most seems to be replacing ProcDiePending here with some another
variable, say ImmediatelyExitFromBlockedState, and somehow go
upstairs through normal return path. Additional Try-Catch seems
can do that but it looks no benefit for the added complexity..

==== Discussion on breaking up connetion especially for SSL

Breaking an SSL connection up in my_sock_write() cause the
following message on client side if it still lives and resumes to
receive from the connection, this seems to show that the client
handles the event properly.

| SSL SYSCALL error: EOF detected
| The connection to the server was lost. Attempting reset: Succeeded.

==== Discussion on reliability of select(2)

This method is not a perfect solution, since the select(2)
sometimes gives a wrong oracle about wheather the follwing
send(2) will be blocked.

Even so, as far as I see, select(2) just after exiting from
blocked send(2) by signal seems always says 'write will be
blocked', so what this patch does seems to save most cases except
when the any amount of socket buffer is vacated just before the
following select. The second chance to exit from blocked send(2)
won't come after this(, before one more pg_terminate_backend() ?).

Removing the select(2) from the condition (that is,
CHECK_FOR_INTERRUPTS() is called always ProcDiePending is true)
prevents such a possibility, in exchange for killing 'really
live' connection but IMHO it's no problem on intentional server

More reliable measure for this would be non-blocking IO but it
changes more of the code.

==== Reproducing the situation.

Another possible question would be about the possibility of such
blocking, or how to reproduce the situation. I found that send(2)
on CentOS6.5 somehow sends successfully, for most cases, the
remaining data at the retry after exiting by signal during being
blocked with buffer full, in spite of no change in environment.

So reproducing the stucked situation is rather difficult on the
server as is. But such situation would be reproduced quite easily
with some cheat, that is, enlarging PQ send buffer, say by ten

Applying the attached testing patch (socket_block_test.patch),
the following steps will make the stucked situation.

1. Do a select which returns big result and enter Ctrl-Z just
after invoking.

cl> $ psql -h localhost postgres
cl> postgres=# select 1 from generate_series(0, 9999999);
cl> ^Z
cl> [4]+ Stopped psql -h localhost postgres

2. Watch the server to stuck.

The server starts to print lines like following to console
after a while, then stops. The number enclosed by the square
brackets is PID of the server.

sv> #### [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0

3. Do pg_terminate_backend().

cl> $ psql postgres -c "select pg_terminate_backend(8809)"

The server will stuck like follows, PID=8811 is the another
session made by the command just above.

sv> #### [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0
sv> #### [8811] (bare) rest = 0 / 327 bytes, ProcDiePending = 0
sv> #### [8809] (bare) rest = 500 / 81920 bytes, ProcDiePending = 1
sv> #### [8811] (bare) rest = 0 / 78 bytes, ProcDiePending = 0

The server 8809 is blocked during sending the remaining 500
bytes and won't come back forever except for SIGKILL, or
possible data reading on the client (fg does it).

cl> $ fg

sv> #### [8809] (bare) rest = 0 / 500 bytes, ProcDiePending = 1
sv> FATAL: terminating connection due to administrator command
sv> STATEMENT: select 1 from generate_series(0, 9999999);
sv> #### [8809] (bare) rest = 0 / 116 bytes, ProcDiePending = 0
sv> #### [8883] (bare) rest = 0 / 327 bytes, ProcDiePending = 0

If you don't see the situation to occur, changing the value of
select clause (by its length, not by value:) would be
effective, or entering Ctrl-Z after some debug output also
would be effective.

For SSL connections, the debug output looks like the following,

sv> #### [20064] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 0 / 16413 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 0 / 16413 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 0 / 16413 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 980 / 16413 bytes, ProcDiePending = 1
sv> #### [20064] (SSL) rest = 0 / 980 bytes, ProcDiePending = 1
sv> #### [20064] (SSL) rest = 1029 / 16413 bytes, ProcDiePending = 1

"bare" here in turn means the status of SSL_write and "SSL"
means the status of the underlying 'bare' socket of SSL
connection. (Sorry for the confising labelings..)

The (bare) line above is not corresponding to the following
bunch of (SSL) lines, but its precedents. At the fifth line,
send(2) exits by signal issued by pg_teminate_backend() then
retry (somehow) successfully at sixth line but SSL layer gave
another 16413 bytes and only 1029 bytes of that is sent by the
first try and the server stucked at the second try for the
seventh line. The control doesn't come back to secure_write()
during this sequence.


Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
break_socketblocking_on_termination_v1.patch text/x-patch 2.0 KB
socket_block_test.patch text/x-patch 1.3 KB


Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2014-06-30 08:14:54 Re: How about a proper TEMPORARY TABLESPACE?
Previous Message Michael Paquier 2014-06-30 07:43:38 Re: IMPORT FOREIGN SCHEMA statement