Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations

From: Benoit T <benoit(dot)tigeot(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements: Add `calls_aborted` counter for tracking query cancellations
Date: 2025-08-15 16:55:39
Message-ID: 4C0ECE11-C922-479F-9C5E-5A42F7BAF07F@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 14 Aug 2025, at 10:18, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> That seems kind of limited to me in scope. The executor is only one
> part of the system. I would have considered using an xact callback
> when a transaction is aborted if I were to do a patch like the one you
> are proposing, to know how many times a transaction is failing at a
> specific phase, because you should know the latest query_id in this
> case to be able to put a counter update in the correct slot (right?).
>
> +-- aborted calls tracking
> +SELECT pg_sleep(0.5);
> + pg_sleep
> +----------
> +
> +(1 row)
>
> Using hardcoded sleep times for deterministic tests is never a good
> idea. On fast machines, they eat time for nothing. And if not
> written correctly, they may not achieve their goal on slow machines
> because the sleep threshold may be reached before the custom action is
> taken. If you want to force a failure, you should just use a SQL that
> you know would fail at execution time (based on your implementation
> expects).

Hello,

Thanks for the review.

On scope and callbacks
- I prototyped a version with RegisterXactCallback and RegisterSubXactCallback but I’m not very happy with the result. Same limitation as v4: we only have a stable identity (queryId) after parse/analyze. The xact/subxact callbacks fire with no statement context, so you still need to know “which statement was active” at abort time.
- To deal with that, I keep a small backend-local tracker of the current statement’s key (userid, dbid, queryId, toplevel) when we enter the executor, mark it completed on success, and bump calls_aborted in the callbacks only if active && !completed.
- This reliably captures executor-time failures and timeouts. It cannot attribute errors that happen before post-parse (syntax/name resolution), because queryId isn’t set yet. That’s the hard boundary. Planner failures aren’t attributed in this version because the tracker is set in ExecutorRun..still.

On the value of callbacks
- It seems callbacks alone are insufficient (no per-statement context).
- If we want to preserve pgss semantics (only report statements that reached post-parse and have a queryId), the v4 approach is simpler and safer; I could look into planner error handling in a next patch version.
- If we also want to count very-early errors, that implies raw-text keyed entries (no queryId), which changes pg_stat_statements’ semantics (normalization, PII risk, cardinality). That probably should be separate extension as Julien Rouhaud suggested: https://www.postgresql.org/message-id/aJ2ov4KYM2vX4uAs%40jrouhaud. Not sure I would be able to make this, also I need to have something that will be supported by managed PostgreSQL with extension restrictions.

Happy to adjust if you see a better way to thread “latest queryId” into the abort path without the local tracker.

On pre-creating entries earlier
- I considered creating entries from raw text to count very-early errors, but I’ve avoided it: pgss is keyed on the normalized parse tree (queryId). Pre-hashing raw text weakens normalization, risks storing sensitive literals before jumbling, and adds lock churn in hot paths.

Tests
- Agreed on sleep flakiness. I switched to deterministic executor-time failures: duplicate key (primary key violation) and a check constraint violation. Both trip calls_aborted without timing assumptions.

Also, are we okay tracking statements that never reach post-parse (i.e., no queryId)? That would effectively key on raw text, which changes pg_stat_statements’ semantics.

Benoit

Attachment Content-Type Size
patch_pgss_calls_aborted_v5.patch application/octet-stream 10.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-08-15 16:57:02 Re: Making jsonb_agg() faster
Previous Message Tom Lane 2025-08-15 16:52:06 Re: Improve hash join's handling of tuples with null join keys