Re: Add client connection check during the execution of the query

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add client connection check during the execution of the query
Date: 2019-07-18 01:22:34
Message-ID: CA+hUKGJ3-N0F7T_tcRnGZv0_g_DJwcYM2=XTYCHrvcFG2YrrWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 6, 2019 at 12:27 AM Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru> wrote:
> Well, indeed in case of cable disconnect only way to detect it with
> proposed approach is to have tcp keepalive. However if disconnection
> happens due to client application shutdown then client OS should itself
> properly close than connection and therefore this patch will detect
> such situation without keepalives configured.

Yeah.

+1 for this patch, with a few adjustments including making the test
use pg_sleep() as mentioned. It does something useful, namely
cancelling very long running queries sooner if the client has gone
away instead of discovering that potentially much later when sending a
response. It does so with a portable kernel interface (though we
haven't heard from a Windows tester), and I think it's using it in a
safe way (we're not doing the various bad things you can do with
MSG_PEEK, and the fd is expected to be valid for the process's
lifetime, and the socket is always in non-blocking mode*, so I don't
think there is any bad time for pg_check_client_connection() to run).
It reuses the existing timer infrastructure so there isn't really any
new overhead. One syscall every 10 seconds or whatever at the next
available CFI is basically nothing. On its own, this patch will
reliably detect clients that closed abruptly or exited/crashed (so
they client kernel sends a FIN packet). In combination with TCP
keepalive, it'll also detect clients that went away because the
network or client kernel ceased to exist.

*There are apparently no callers of pg_set_block(), so if you survived
pq_init() you have a non-blocking socket. If you're on Windows, the
code always sets the magic pgwin32_noblock global flag before trying
to peek. I wondered if it's OK that the CFI would effectively clobber
that with 0 on its way out, but that seems to be OK because every
place in the code that sets that flag does so immediately before an IO
operationg without a CFI in between. As the comment in pgstat.c says
"This is extremely broken and should be fixed someday.". I wonder if
we even need that flag at all now that all socket IO is non-blocking.

--
Thomas Munro
https://enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2019-07-18 02:18:11 Re: d25ea01275 and partitionwise join
Previous Message Andres Freund 2019-07-18 01:20:18 Re: Custom table AMs need to include heapam.h because of BulkInsertState