Re: [PATCH] Add last_executed timestamp to pg_stat_statements

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Christoph Berg <myon(at)debian(dot)org>
Cc: Pavlo Golub <pavlo(dot)golub(at)cybertec(dot)at>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Add last_executed timestamp to pg_stat_statements
Date: 2026-02-09 16:51:30
Message-ID: CAA5RZ0sxPWP2xm8fxhscE+cUqC2VSFi9UZ9882BdGZ0MbGQNUA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > How about
> > TimestampTz stmt_end = TimestampTzPlusMilliseconds(
> > GetCurrentStatementStartTimestamp(),
> > (int64) total_time
> > );
> > We have total_time as an argument already! No kernel calls, sweet and easy!
>
> Cool idea!

This calculation could be wrong for very common cases in extended
query protocol,

Here is a script to test with:

```
select pg_stat_statements_reset();

BEGIN;
select now() as now, clock_timestamp() as clock_timestamp,
pg_sleep($1) \bind 10 \g

\! sleep 10

SELECT now() as now, clock_timestamp() as clock_timestamp, $1 \bind 1 \g
END;

select stats_last_updated, total_exec_time, substr(query, 1, 150) as
query from pg_stat_statements;

````

With v3 applied, notice the output is calculating a stats_last_updated
that is beyond the current time

```
pg_stat_statements_reset
-------------------------------
2026-02-09 16:13:35.188849+00
(1 row)

BEGIN
now | clock_timestamp | pg_sleep
------------------------------+-------------------------------+----------
2026-02-09 16:13:35.18911+00 | 2026-02-09 16:13:35.189397+00 |
(1 row)

now | clock_timestamp | ?column?
------------------------------+-------------------------------+----------
2026-02-09 16:13:35.18911+00 | 2026-02-09 16:13:55.193443+00 | 1
(1 row)

COMMIT
stats_last_updated | total_exec_time |
query
-------------------------------+-----------------+-------------------------------------------------------------------------
2026-02-09 16:13:55.19367+00 | 0.007401 | SELECT now() as
now, clock_timestamp() as clock_timestamp, $1
2026-02-09 16:13:55.193664+00 | 0.00103 | END
2026-02-09 16:13:35.189111+00 | 0.00098 | BEGIN
2026-02-09 16:13:35.188584+00 | 0.090183 | select
pg_stat_statements_reset()
2026-02-09 16:14:05.194134+00 | 10000.751122 | select now() as
now, clock_timestamp() as clock_timestamp, pg_sleep($1)
(5 rows)
```

This happens because in the case of extended query protocol,
ExecutorEnd is called
at the next query. This has been discussed in [1] [2].

So, for this to work, we will likely need to store the query start
time in the queryDesc; actually
queryDesc->totaltime, and set the query start time at ExecutorStart,
during InstrAlloc.

> > I think it's better because last_execution_start is already a known
> > timestamp in pg_stat_activity.query_start and some tool that finds a
> > long running query in pg_stat_activity, knowing the
> > query_start they could then go look it up in pg_stat_statements.
>
> That only works if a) the query was not yet overwritten in
> pg_stat_activity and b) neither in pg_stat_statements. Optimizing for
> that use case seems pretty narrow.
>
> > What I'm really getting at is separating these fields will open up
> > more use cases, IMO.

Maybe this is a bad use case. But I felt separating these 2 fields will
be more flexible.

> Generally, I think pgss should have cumulative statistics, and less
> about individual executions, so I'm not really sure what practical
> problem "last start" and "last runtime" would solve. The
> last_stats_update column we are talking about here is different in the
> sense that it's not about an individual execution, but infrastructure
> for retrieving the stats sensibly.

Sure, generally, pg_stat_statements is for cumulative stats, but we also
do have computed stats such as max/min/stddev, etc. But, it's not without
precedent that we track timestamps of the last time some operation occurred.
We do that in views that have a purpose of tracking cumulative data, because
these timestamps are useful. See pg_stat_all_tables.last_seq_scan or
last_autovacuum
as an example.

Maybe having the last runtime column is not that valuable if we can
correctly calculate
the last execution time. AlsoI will be a strong -1 calling this field
"stats_last_updated"
instead of "last_execution_time".

[1] https://www.postgresql.org/message-id/3d07ee43-8855-42db-97e0-bad5db82d972@dalibo.com
[2] https://www.postgresql.org/message-id/CAA5RZ0t2+GLnE_55L2cfCay+L8yPFpdPRVQo-JswUFgXy-EK5Q@mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2026-02-09 17:05:50 Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible
Previous Message Alexandra Wang 2026-02-09 15:55:17 Re: pg_plan_advice