Re: Make query cancellation keys longer

From: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Make query cancellation keys longer
Date: 2024-03-03 14:27:35
Message-ID: CAGECzQSGzRE9eF=twhDgDXGKQVzrx=WhhVPEUmbQoDsHAVKQkg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 1 Mar 2024 at 06:19, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:
> This is a preliminary review. I'll look at this more closely soon.

Some more thoughts after looking some more at the proposed changes

+#define EXTENDED_CANCEL_REQUEST_CODE PG_PROTOCOL(1234,5677)

nit: I think the code should be 1234,5679 because it's the newer
message type, so to me it makes more sense if the code is a larger
number.

+ * FIXME: we used to use signal_child. I believe kill() is
+ * maybe even more correct, but verify that.

signal_child seems to be the correct one, not kill. signal_child has
this relevant comment explaining why it's better than plain kill:

* On systems that have setsid(), each child process sets itself up as a
* process group leader. For signals that are generally interpreted in the
* appropriate fashion, we signal the entire process group not just the
* direct child process. This allows us to, for example, SIGQUIT a blocked
* archive_recovery script, or SIGINT a script being run by a backend via
* system().

+SendCancelRequest(int backendPID, int32 cancelAuthCode)

I think this name of the function is quite confusing, it's not sending
a cancel request, it is processing one. It sends a SIGINT.

> While we're at it, switch to using atomics in pmsignal.c for the state
> field. That feels easier to reason about than volatile
> pointers.

I feel like this refactor would benefit from being a separate commit.
That would make it easier to follow which change to pmsignal.c is
necessary for what.

+ MyCancelKeyLength = (MyProcPort != NULL &&
MyProcPort->extended_query_cancel) ? MAX_CANCEL_KEY_LENGTH : 4;

I think we should be doing this check the opposite way, i.e. only fall
back to the smaller key when explicitly requested:

+ MyCancelKeyLength = (MyProcPort != NULL &&
MyProcPort->old_query_cancel) ? 4 : MAX_CANCEL_KEY_LENGTH;

That way we'd get the more secure, longer key length for non-backend
processes such as background workers.

+ case EOF:
+ /* We'll come back when there is more data */
+ return PGRES_POLLING_READING;

Nice catch, I'll go steal this for my patchset which adds all the
necessary changes to be able to do a protocol bump[1].

+ int be_pid; /* PID of backend --- needed for XX cancels */

Seems like you accidentally added XX to the comment in this line.

[1]: https://www.postgresql.org/message-id/flat/CAGECzQSr2%3DJPJHNN06E_jTF2%2B0E60K%3DhotyBw5wY%3Dq9Wvmt7DQ%40mail.gmail.com#359e4222eb161da37124be1a384f8d92

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2024-03-03 14:29:22 Re: pgsql: Improve performance of subsystems on top of SLRU
Previous Message Alvaro Herrera 2024-03-03 13:58:39 Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock