Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-11 23:27:21
Message-ID: 2912315.1641943641@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com> writes:
> Attached are 3 patches that address the feedback from Andres about code duplication
> and splitting up commits. I completely removed internal_cancel now, since it only had
> one caller at this point.

Here's some cleaned-up versions of 0001 and 0002. I have not bothered
with 0003 because I for one will not be convinced to commit it. The
risk-vs-reward ratio looks far too high on that, and it's not obvious
why 0002 doesn't already solve whatever problem there is.

A couple of notes:

0001 introduces malloc/free into PQrequestCancel, which promotes it
from being "probably unsafe in a signal handler" to "definitely unsafe
in a signal handler". Before, maybe you could have gotten away with
it if you were sure the PGconn object was idle, but now it's no-go for
sure. I don't have a big problem with that, given that it's been
deprecated for decades, but I made the warning text in libpq.sgml a
little stronger.

As for 0002, I don't see a really good reason why we shouldn't try
to do it on Windows too. If connect() will work, then it seems
likely that setsockopt() and WSAIOCtl() will too. Moreover, note
that at least in our own programs, PQcancel doesn't *really* run in a
signal handler on Windows: see commentary in src/fe_utils/cancel.c.
(The fact that we now have test coverage for PQcancel makes me a lot
more willing to try this than I might otherwise be. Will be
interested to see the cfbot's results.)

Also, I was somewhat surprised while working on this to realize
that PQconnectPoll doesn't call setTCPUserTimeout if keepalives
are disabled per useKeepalives(). I made PQcancel act the same,
but I wonder if that was intentional or a bug. I'd personally
have thought that those settings were independent.

regards, tom lane

Attachment Content-Type Size
v2-0001-Refactor-to-remove-internal_cancel-helper.patch text/x-diff 5.4 KB
v2-0002-Use-timeout-socket-options-for-cancel-connections.patch text/x-diff 7.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2022-01-12 00:19:07 Re: Skipping logical replication transactions on subscriber side
Previous Message Bossart, Nathan 2022-01-11 22:18:24 Re: Add index scan progress to pg_stat_progress_vacuum