Re: Make query cancellation keys longer

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Make query cancellation keys longer
Date: 2024-03-08 22:20:19
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/03/2024 07:19, Jelte Fennema-Nio wrote:
> I think this behaviour makes more sense as a minor protocol version
> bump instead of a parameter.
Ok, the consensus is to bump the minor protocol version for this. Works
for me.

> + if (strcmp(conn->, "_pq_.extended_query_cancel") == 0)
> + {
> + /* that's ok */
> + continue;
> + }
> Please see this thread[1], which in the first few patches makes
> pqGetNegotiateProtocolVersion3 actually usable for extending the
> protocol. I started that, because very proposed protocol change that's
> proposed on the list has similar changes to
> pqGetNegotiateProtocolVersion3 and I think we shouldn't make these
> changes ad-hoc hacked into the current code, but actually do them once
> in a way that makes sense for all protocol changes.

Thanks, I will take a look and respond on that thread.

>> Documentation is still missing
> I think at least protocol message type documentation would be very
> helpful in reviewing, because that is really a core part of this
> change.

Added some documentation. There's more work to be done there, but at
least the message type descriptions are now up-to-date.

> Based on the current code I think it should have a few changes:
> + int cancel_key_len = 5 + msgLength - (conn->inCursor - conn->inStart);
> +
> + conn->be_cancel_key = malloc(cancel_key_len);
> + if (conn->be_cancel_key == NULL)
> This is using the message length to determine the length of the cancel
> key in BackendKey. That is not something we generally do in the
> protocol. It's even documented: "Notice that although each message
> includes a byte count at the beginning, the message format is defined
> so that the message end can be found without reference to the byte
> count." So I think the patch should be changed to include the length
> of the cancel key explicitly in the message.

The nice thing about relying on the message length was that we could
just redefine the CancelRequest message to have a variable length
secret, and old CancelRequest with 4-byte secret was compatible with the
new definition too. But it doesn't matter much, so added an explicit
length field.

FWIW I don't think that restriction makes sense. Any code that parses
the messages needs to have the message length available, and I don't
think it helps with sanity checking that much. I think the documentation
is a little anachronistic. The real reason that all the message types
include enough information to find the message end is that in protocol
version 2, there was no message length field. The only exception that
doesn't have that property is CopyData, and it's no coincidence that it
was added in protocol version 3.

On 03/03/2024 16:27, Jelte Fennema-Nio wrote:
> On Fri, 1 Mar 2024 at 06:19, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:
>> This is a preliminary review. I'll look at this more closely soon.
> Some more thoughts after looking some more at the proposed changes
> nit: I think the code should be 1234,5679 because it's the newer
> message type, so to me it makes more sense if the code is a larger
> number.

Unfortunately 1234,5679 already in use by NEGOTIATE_SSL_CODE. That's why
I went the other direction. If we want to add this to "the end", it
needs to be 1234,5681. But I wanted to keep the cancel requests together.

> + * FIXME: we used to use signal_child. I believe kill() is
> + * maybe even more correct, but verify that.
> signal_child seems to be the correct one, not kill. signal_child has
> this relevant comment explaining why it's better than plain kill:
> * On systems that have setsid(), each child process sets itself up as a
> * process group leader. For signals that are generally interpreted in the
> * appropriate fashion, we signal the entire process group not just the
> * direct child process. This allows us to, for example, SIGQUIT a blocked
> * archive_recovery script, or SIGINT a script being run by a backend via
> * system().

I changed it to signal the process group if HAVE_SETSID, like
pg_signal_backend() does. We don't need to signal both the process group
and the process itself like signal_child() does, because we don't have
the race condition with recently-forked children that signal_child()
talks about.

> +SendCancelRequest(int backendPID, int32 cancelAuthCode)
> I think this name of the function is quite confusing, it's not sending
> a cancel request, it is processing one. It sends a SIGINT.

Heh, well, it's sending the cancel request signal to the right backend,
but I see your point. Renamed to ProcessCancelRequest.

>> While we're at it, switch to using atomics in pmsignal.c for the state
>> field. That feels easier to reason about than volatile
>> pointers.
> I feel like this refactor would benefit from being a separate commit.
> That would make it easier to follow which change to pmsignal.c is
> necessary for what.

Point taken. I didn't do that yet, but it makes sense.

> + MyCancelKeyLength = (MyProcPort != NULL &&
> MyProcPort->extended_query_cancel) ? MAX_CANCEL_KEY_LENGTH : 4;
> I think we should be doing this check the opposite way, i.e. only fall
> back to the smaller key when explicitly requested:
> + MyCancelKeyLength = (MyProcPort != NULL &&
> MyProcPort->old_query_cancel) ? 4 : MAX_CANCEL_KEY_LENGTH;
> That way we'd get the more secure, longer key length for non-backend
> processes such as background workers.

+1, fixed.

On 03/03/2024 19:27, Jelte Fennema-Nio wrote:
> On Sun, 3 Mar 2024 at 15:27, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:
>> + case EOF:
>> + /* We'll come back when there is more data */
>> Nice catch, I'll go steal this for my patchset which adds all the
>> necessary changes to be able to do a protocol bump[1].
> Actually, it turns out your change to return PGRES_POLLING_READING on
> EOF is incorrect (afaict). A little bit above there is this code
> comment above a check to see if the whole body was received:
> * Can't process if message body isn't all here yet.
> *
> * After this check passes, any further EOF during parsing
> * implies that the server sent a bad/truncated message.
> * Reading more bytes won't help in that case, so don't return
> * PGRES_POLLING_READING after this point.
> So I'll leave my patchset as is.

Yep, thanks.

One consequence of this patch that I didn't mention earlier is that it
makes libpq incompatible with server version 9.2 and below, because the
minor version negotiation was introduced in version 9.3. We could teach
libpq to disconnect and reconnect with minor protocol version 3.0, if
connecting with 3.1 fails, but IMHO it's better to not complicate this
and accept the break in backwards-compatibility. 9.3 was released in
2013. We dropped pg_dump support for versions older than 9.2 a few years
ago, this raises the bar for pg_dump to 9.3 as well.

Heikki Linnakangas
Neon (

Attachment Content-Type Size
v2-0001-Move-cancel-key-generation-to-after-forking-the-b.patch text/x-patch 24.4 KB
v2-0002-Make-cancel-request-keys-longer-bump-minor-protoc.patch text/x-patch 32.3 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-03-08 22:24:15 Re: Streaming I/O, vectored I/O (WIP)
Previous Message Nathan Bossart 2024-03-08 22:17:58 Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value