Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Date: 2022-09-14 21:53:31
Message-ID: 2906785.1663192411@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com> writes:
> [ non-blocking PQcancel ]

I pushed the 0001 patch (libpq_pipeline documentation) with a bit
of further wordsmithing.

As for 0002, I'm not sure that's anywhere near ready. I doubt it's
a great idea to un-deprecate PQrequestCancel with a major change
in its behavior. If there is anybody out there still using it,
they're not likely to appreciate that. Let's leave that alone and
pick some other name.

I'm also finding the entire design of PQrequestCancelStart etc to
be horribly confusing --- it's not *bad* necessarily, but the chosen
function names are seriously misleading. PQrequestCancelStart doesn't
actually "start" anything, so the apparent parallel with PQconnectStart
is just wrong. It's also fairly unclear what the state of a cancel
PQconn is after the request cycle is completed, and whether you can
re-use it (especially after a failed request), and whether you have
to dispose of it separately.

On the whole it feels like a mistake to have two separate kinds of
PGconn with fundamentally different behaviors and yet no distinction
in the API. I think I'd recommend having a separate struct type
(which might internally contain little more than a pointer to a
cloned PGconn), and provide only a limited set of operations on it.
Seems like create, start/continue cancel request, destroy, and
fetch error message ought to be enough. I don't see a reason why we
need to support all of libpq's inquiry operations on such objects ---
for instance, if you want to know which host is involved, you could
perfectly well query the parent PGconn. Nor do I want to run around
and add code to every single libpq entry point to make it reject cancel
PGconns if it can't support them, but we'd have to do so if there's
just one struct type.

I'm not seeing the use-case for PQconnectComplete. If you want
a non-blocking cancel request, why would you then use a blocking
operation to complete the request? Seems like it'd be better
to have just a monolithic cancel function for those who don't
need non-blocking.

This change:

--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -59,12 +59,15 @@ typedef enum
{
CONNECTION_OK,
CONNECTION_BAD,
+ CONNECTION_CANCEL_FINISHED,
/* Non-blocking mode only below here */

is an absolute non-starter: it breaks ABI for every libpq client,
even ones that aren't using this facility. Why do we need a new
ConnStatusType value anyway? Seems like PostgresPollingStatusType
covers what we need: once you reach PGRES_POLLING_OK, the cancel
request is done.

The test case is still not very bulletproof on slow machines,
as it seems to be assuming that 30 seconds == forever. It
would be all right to use $PostgreSQL::Test::Utils::timeout_default,
but I'm not sure that that's easily retrievable by C code.
Maybe make the TAP test pass it in with another optional switch
to libpq_pipeline? Alternatively, we could teach libpq_pipeline
to do getenv("PG_TEST_TIMEOUT_DEFAULT") with a fallback to 180,
but that feels like it might be overly familiar with the innards
of Utils.pm.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sandro Santilli 2022-09-14 22:01:00 Re: [PATCH] Support % wildcard in extension upgrade filenames
Previous Message David Rowley 2022-09-14 21:44:26 Re: Fix comment in convert_saop_to_hashed_saop