Re: Don't use the deprecated and insecure PQcancel in our frontend tools anymore

From: "Jelte Fennema-Nio" <postgres(at)jeltef(dot)nl>
To: "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org>, "Jacob Champion" <jacob(dot)champion(at)enterprisedb(dot)com>
Subject: Re: Don't use the deprecated and insecure PQcancel in our frontend tools anymore
Date: 2026-03-06 02:12:49
Message-ID: DGVC3GZ0U1NS.3P1XEU5E065XQ@jeltef.nl
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu Mar 5, 2026 at 7:30 PM CET, Heikki Linnakangas wrote:
> It took me a while to get the big picture of how this works. cancel.c
> could use some high-level comments explaining how to use the facility;
> it's a real mixed bag right now.

Attached is a version with a bunch more comments. I agree this cancel
logic is hard to understand without them. It took me quite a while to
understand it myself. (I don't think the code got any harder to
understand with these changes though, the exact same complexity was
already there for Windows. But I agree more commends are good.)

> The cancelConn mechanism is a global variable, which means that it can
> only be used with one connection in the process. That's OK with the
> current callers, but seems short-sighted. What if we wanted to use it
> for pgbench, for example, which uses multiple threads and connections?
> Or if we changed pg_dump to use multiple threads, like you also
> suggested as a possible follow-up.

Allowing for multiple callers seems like scope-creep to me, and also
hard to do in a generic way. I'd say IF we want that in a generic way
I'd first want to see a version of that that solves the problem for
pgdench/pg_dump, before generalizing it to cancel.c.

> This is racy, if the cancellation thread doesn't immediately process the
> wakeup. For example, because it's still busy processing a previous
> wakeup, because there's a network hiccup or something. By the time the
> cancellation thread runs, the main thread might already be running a
> different query than it was when the user hit CTRL-C.

I now noted this in one of the new comments. I don't think there's a way
around this race condition entirely. It's simply a limitation of our
cancel protocol (because it's impossible to specify which query on a
connection should be cancelled).

In theory we could reduce the window for the race, by having all
frontend tools use async connections and have the main thread wait for
either the self-pipe or a cancel. That way it would be more similar to
the previous signal code in behaviour. That's a much bigger lift though,
i.e. all PQexec and PQgetResult calls would need to be modified. My
proposed change doesn't require changing the callsites at all.

In interactive usage of psql it seems pretty unlikely that people will
hit this race condition. In non-interactive use, it doesn't matter
because you just want Ctrl-C to cancel the application (whichever query
it currently runs). So I'd say it's currently not worth the complexity
to do the less racy option.

Attachment Content-Type Size
v4-0001-Move-Windows-pthread-compatibility-functions-to-s.patch text/x-patch 2.9 KB
v4-0002-Don-t-use-deprecated-and-insecure-PQcancel-psql-a.patch text/x-patch 15.9 KB
v4-0003-pg_dump-Don-t-use-the-deprecated-and-insecure-PQc.patch text/x-patch 23.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2026-03-06 02:17:35 Re: UPDATE run check constraints for affected columns only
Previous Message David Steele 2026-03-06 01:27:38 Re: Return pg_control from pg_backup_stop().