Re: Add client connection check during the execution of the query

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Maksim Milyutin <milyutinma(at)gmail(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: Add client connection check during the execution of the query
Date: 2021-12-11 04:41:34
Message-ID: CA+hUKGJeRMDNvALH2xq1wS1BHmFs93P83jatE_AD5tm8=94ryA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 12, 2021 at 3:10 AM Maksim Milyutin <milyutinma(at)gmail(dot)com> wrote:
> Good work. I have tested your patch on Linux and FreeBSD on three basic
> cases: client killing, cable breakdown (via manipulations with firewall)
> and silent closing client connection before completion of previously
> started query in asynchronous manner. And all works fine.

Thanks for the testing and review!

> + WaitEvent events[3];
>
> 3 is looks like as magic constant. We might to specify a constant for
> all event groups in FeBeWaitSet.

Yeah. In fact, we really just need one event. Fixed.

> + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetLatchPos, WL_LATCH_SET, NULL);
> + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetSocketPos,
> WL_SOCKET_CLOSED, NULL);
> + rc = WaitEventSetWait(FeBeWaitSet, 0, events, lengthof(events), 0);
> + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetLatchPos, WL_LATCH_SET,
> MyLatch);
>
> AFAICS, side effect to
> (FeBeWaitSet->events[FeBeWaitSetSocketPos]).events by setting
> WL_SOCKET_CLOSED value under calling of pq_check_connection() function
> doesn't have negative impact later, does it? That is, all
> WaitEventSetWait() calls have to setup socket events on its own from
> scratch.

Correct: every site that waits for FeBeWaitSet explicitly modifies the
socket event to say what it's waiting for (and that is often a no-op
internally), so we don't have to worry about restoring the previous
state. I've added a comment about that. We should work harder to
restore the latch than my previous patch did, though. Now I'm using a
PG_FINALLY() block.

I'm hoping to push this soon, after another round of testing, if
there's no more feedback.

There is one more OS that could be added, but I'll leave it out of the
initial commit, pending further investigation. Since I recently had
to set up a Windows dev VM up to test some other portability stuff, I
had a chance to test the FD_CLOSE hypothesis. You just have to do
this to enable it:

@@ -2069,6 +2069,7 @@ WaitEventSetCanReportClosed(void)
{
#if (defined(WAIT_USE_POLL) && defined(POLLRDHUP)) || \
defined(WAIT_USE_EPOLL) || \
+ defined(WAIT_USE_WIN32) || \
defined(WAIT_USE_KQUEUE)
return true;
#else

It seems to work! I'm not sure why it works, or whether we can count
on it, though. These sentences from the documentation[1] seem to
contract each other:

"FD_CLOSE being posted after all data is read from a socket. An
application should check for remaining data upon receipt of FD_CLOSE
to avoid any possibility of losing data."

My test says that the first sentence is wrong, but the second doesn't
exactly say that it has reliable POLLRDHUP nature, and I haven't found
one that does, yet. Perhaps we can convince ourselves of that in
follow-up work.

For the record, I tested two scenarios. The client was a Unix system,
the server a Windows 10 VM.

1. Connecting with psql and running "SELECT pg_sleep(60)" and then
killing the psql process. I'm not surprised that this one worked; it
would work if we tested for WL_SOCKET_READABLE too, but we already
decided that's not good enough.
2. Connecting from a C program that does PQsendQuery(conn, "SELECT
pg_sleep(60)") and then immediately PQfinish(conn), to test whether
the FD_CLOSE event is reported even though there is an unconsumed 'X'
in the socket. I wouldn't want to ship the feature on an OS where
this case doesn't get reported, or doesn't get reported sometimes,
because it'd be unreliable and unlike the behaviour on other OSes.
But it worked for me.

[1] https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsaeventselect

Attachment Content-Type Size
v3-0001-Add-WL_SOCKET_CLOSED-for-socket-shutdown-events.patch text/x-patch 7.8 KB
v3-0002-Use-WL_SOCKET_CLOSED-for-client_connection_check_.patch text/x-patch 4.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-12-11 04:44:22 Re: [PATCH] Don't block HOT update by BRIN index
Previous Message Andres Freund 2021-12-11 04:30:16 Re: Unifying VACUUM VERBOSE and log_autovacuum_min_duration output