Re: Add client connection check during the execution of the query

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add client connection check during the execution of the query
Date: 2021-03-22 02:29:51
Message-ID: CA+hUKGKfRypYnXMbXa=yMP4K+kexY4doZxZxLQv=zqaZsJ016Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 6, 2021 at 5:50 PM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> + if (client_connection_check_interval > 0)
> + enable_timeout_after(CLIENT_CONNECTION_CHECK_TIMEOUT,
>
> + /* Start timeout for checking if the client has gone away if necessary. */
> + if (client_connection_check_interval)

Thanks! Fixed.

> + Sets a time interval, in milliseconds, between periodic
>
> I wonder if the interval should be expressed in seconds. Since the description says: while running very long queries.

Hmm. Personally I think we should stop emphasising default units
(it's an internal detail) and encourage people to specify the units.
But that was indeed not good wording, so I fixed it. I've now
rewritten the documentation completely, and moved it to the connection
section near the related TCP keepalive settings. I tried to be really
explicit about what this feature really does, and in which cases it
helps, and what behaviour you can expect without this feature
configured.

Other changes:

1. Fixed inconsistencies in .conf.sample and GUC descriptions from by
Ishii-san's review.
2. Added comments to pg_check_client_connection() to explain the
conditional logic therein, justifying each conditional branch and the
interactions between this logic and the PqCommReadingMsg and
ClientConnectionLost variables.
3. I added an ereport(COMMERROR) message for unexpected errnos in
this path, since otherwise an errno would be discarded making
diagnosis impossible.
4. Avoided calling enable_timeout_after() multiple times, like we do
for statement timeouts.
5. cfbot told me "Could not determine contrib module type for
connection" on Windows. I do not understand this stuff, so I just
added the new test module "connection" to @contrib_excludes, like
everyone else apparently does.
6. pgindented.

That's enough for today, but here are some things I'm still wondering about:

1. Might need to rethink the name of the GUC. By including "client"
in it, it sounds a bit like it affects behaviour of the client, rather
than the server. Also the internal variable
CheckClientConnectionPending looks funny next to ClientConnectionLost
(could be ClientConnectionCheckPending?).
2. The tests need tightening up. The thing with the "sleep 3" will
not survive contact with the build farm, and I'm not sure if the SSL
test is as short as it could be.
3. Needs testing on Windows.

I've now hacked this code around so much that I've added myself as
co-author in the commit message.

Attachment Content-Type Size
v5-0001-Detect-dropped-connections-while-running-queries.patch text/x-patch 18.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2021-03-22 02:36:27 Re: Why logical replication lancher exits 1?
Previous Message Yugo NAGATA 2021-03-22 02:29:25 Re: Columns correlation and adaptive query optimization