Re: [PATCH v4] pg_stat_statements: Add last_execution_start column

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Pavlo Golub <pavlo(dot)golub(at)cybertec(dot)at>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH v4] pg_stat_statements: Add last_execution_start column
Date: 2026-06-04 10:00:56
Message-ID: CAO6_XqoRLxL2+FsaE3JbwOH+oZ=CAjN8yK9-+p7QeH_LY5B+ag@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

The code looks good. Some comments on the tests:

+-- last_execution_start timestamp tests
+--
+-- Reset stats first to avoid queryId collisions: simple "SELECT
const AS alias"
+-- queries all share the same normalized structure as the STMTTS queries above,
+-- so EXECSTART entries would otherwise land on the pre-existing STMTTS entry.
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;

I don't think the comment on queryId collision is necessary. Looking
at other tests, it's a common pattern to do a pgss reset before
starting a set of tests.

+-- last_execution_start should be set and >= ref_ts_upd1, because the
+-- statement started after we captured the reference timestamp.
+SELECT
+ query,
+ last_execution_start IS NOT NULL as has_ts,
+ last_execution_start >= :'ref_ts_upd1' as after_ref1,
+ stats_since <= last_execution_start as after_stats_since
+FROM pg_stat_statements
+WHERE query LIKE '%EXECSTART%'
+ORDER BY query COLLATE "C";

The test is also checking stats_since and shows that
last_execution_start < stats_since, is it necessary? The result is
also a bit confusing, but makes sense since stats_since is the time
the entry is created in pgss, which happens before ExecutorEnd.

+-- Run EXECSTART1 again and verify that last_execution_start is updated.
+SELECT now() AS ref_ts_upd2 \gset
+SELECT 1 AS "EXECSTART1";
+SELECT
+ query,
+ last_execution_start >= :'ref_ts_upd2' as updated
+FROM pg_stat_statements
+WHERE query LIKE '%EXECSTART1%';
+
+-- test filtering (monitoring use case): find statements that started
+-- executing since our last observation (ref_ts_upd2).
+SELECT count(*) as filtered_count
+FROM pg_stat_statements
+WHERE last_execution_start >= :'ref_ts_upd2'
+ AND query LIKE '%EXECSTART%';

The 'test filtering' feels redundant, we already have the list of
queries executed after ref_ts_upd2.

+SELECT now() AS ref_ts_ext \gset
+-- Use \bind \g to force the extended query protocol.
+SELECT pg_sleep(0.5) AS "DEFERRED_END" \bind \g
+-- Capture a timestamp *after* the sleep finishes but *before* the next
+-- extended-protocol statement replaces the unnamed portal.
+SELECT now() AS ref_ts_ext2 \gset
+-- The pg_sleep query's last_execution_start should be close to ref_ts_ext
+-- (before the sleep), NOT to ref_ts_ext2 (after the sleep).

I think the test doesn't work, or at least, doesn't check what you're
expecting. If I set last_execution_start to
GetCurrentStatementStartTimestamp(), it still passes.

When using the extended protocol with \bind \g, a Sync message is
immediately sent, and since you're in an implicit transaction,
finish_xact_command will drop the portal and call ExecutorEnd. Also,
doing a `SELECT now()` closes the unnamed portal, even when using the
simple protocol.

You could do something like that (and avoid a pg_sleep):

BEGIN;
SELECT 1 AS DEFERRED_END, statement_timestamp() AS query_stmt_ts \bind \gset
-- With an explicit transaction + extended protocol, the portal is left opened
-- ExecutorEnd will only be called when the next command is processed
END;

SELECT query, last_execution_start < :'query_stmt_ts' as before_next
FROM pg_stat_statements
WHERE query LIKE '%DEFERRED_END%'
ORDER BY query COLLATE "C";

With entry->last_execution_start =
GetCurrentStatementStartTimestamp(), you will have
last_execution_start > query_stmt_ts since it will be using END's
statement_start.
With entry->last_execution_start = exec_start, you use the statement
start set during the Bind message, which should be smaller than
statement_timestamp() (which is the statement start set during the
Execute message).

On that note, I wonder if relying on
GetCurrentStatementStartTimestamp() wouldn't be good enough? That
would remove the need to keep track of this state in EState.

Regards,
Anthonin Bonnefoy

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2026-06-04 10:04:25 Re: FOR PORTION OF does not recompute GENERATED STORED columns that depend on the range column
Previous Message Peter Eisentraut 2026-06-04 09:58:48 Re: FOR PORTION OF does not recompute GENERATED STORED columns that depend on the range column