| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Fix tracing of BackendKeyData and CancelRequest messages |
| Date: | 2026-07-03 12:10:30 |
| Message-ID: | a9b48f42-bbb4-4fee-9d27-53668f41f434@iki.fi |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 03/07/2026 12:42, Anthonin Bonnefoy wrote:
> BackendKeyData length was increased from 4 bytes to a variable sized
> length (up to 256 bytes) in a460251f0a. However, PQtrace still traces
> it as a 4 bytes key, leading to a "mismatched message length" warning
> message. The same issue impacts the tracing of CancelRequest.
>
> This shows the lack of coverage of tracing startup packets:
> libpq_pipeline only starts traces once the connection is established.
> There's also no way to test this with psql.
>
> The attached patchset fixes the traces BackendKeyData and
> CancelRequest, add the possibility of tracing any connection created
> by libpq, and add test coverage for the startup packets. It contains
> the following files:
>
> 0001: Fix tracing of BackendKeyData and CancelRequest messages
> @@ -762,7 +762,7 @@ pqTraceOutputMessage(PGconn *conn, const char *message, bool toServer)
> /* No message content */
> break;
> case PqMsg_BackendKeyData:
> - pqTraceOutput_BackendKeyData(conn->Pfdebug, message, &logCursor, regress);
> + pqTraceOutput_BackendKeyData(conn->Pfdebug, message, conn->be_cancel_key_len, &logCursor, regress);
> break;
> case PqMsg_NoData:
> fprintf(conn->Pfdebug, "NoData");
This should use the 'length' parsed from the message, instead of relying
on conn->be_cancel_key_len. Same in pqTraceOutputNoTypeByteMessage(). I
changed that, and committed and backpatched. Thanks!
I did not look closely at the rest of the patches, but more test
coverage sounds like a good idea. Some quick comments just based on the
descriptions though:
> 0002: Add PGTRACE env var in libpq to enable protocol tracing to the
> target file (or '-' for stdout). One limitation of this approach is
> that you should only have one connection writing to a trace file. If
> an application opens multiple connections to the same PGTRACE, each
> connection will open its own fd and erase others traces. But given
> this is to debug protocol messages, it sounds like an acceptable
> limitation.
While testing just the first patch, I noticed that PQcancel() doesn't do
the tracing even if it was enabled in the main connection, which made it
a little hard to test. I didn't try this patch, but I hope it fixes that
problem too.
> 0003: Replace libpq_pipeline PQtrace calls by PGTRACE, adding tracing
> of startup messages, with some additional regress masking. I've also
> added a warning message to report mismatch length similar to what's
> done in pqTraceOutputMessage.
Could you achieve the same by calling PQtrace() earlier? Perhaps switch
to PQconnectStart() interface, and call PQtrace() right after
PQconnectStart()?
> 0004: Add tracing of cancel requests in libpq_pipeline test. Calling
> PQuntrace on the main and monitor connections was necessary to allow
> the cancel connections to append the cancel messages without conflict.
Seems unrelated to pipelining, so perhaps this should go somewhere else.
Then again, we already crossed that bridge when we added the
"protocol_version" test there, which is also unrelated to pipelining.
Maybe we should rename the test module, or add some comments noting that
it's not just for pipelining anymore.
As a side note, I find the output from pqTraceOutputNchar() awkward for
things like the cancel key, which is raw binary bytes:
> BackendKeyData 711421 'm[\x80\x95\xd6\x81{I\x8b\x04>\xab\x7fwMG\xda!\xc1\xb8\xa7\x8d\xb0.\xcd\xaa\x15\x9c\x0f\xd2\xe8\x1b'
We don't need to invest too much in making the trace output pretty, but
I think it would make sense to print that as a hex string.
- Heikki
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tatsuo Ishii | 2026-07-03 12:13:09 | Re: Row pattern recognition |
| Previous Message | Ranier Vilela | 2026-07-03 12:02:28 | lazy_scan_heap: avoid vacuum block, already vacuumed (src/backend/access/heap/vacuumlazy.c) |