Re: Tracking last scan time

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tracking last scan time
Date: 2022-09-01 12:18:42
Message-ID: CAEze2Whn3UG0RHYGXtKfN9BX9m3F53Fh813_FByhC2XF1DErvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2022-08-31 19:52:49 +0200, Matthias van de Meent wrote:
> > As for having a lower granularity and preventing the
> > one-syscall-per-Relation issue, can't we reuse the query_start or
> > state_change timestamps that appear in pg_stat_activity (potentially
> > updated immediately before this stat flush), or some other per-backend
> > timestamp that is already maintained and considered accurate enough
> > for this use?
>
> The problem is that it won't change at all for a query that runs for a week -
> and we'll report the timestamp from a week ago when it finally ends.

This earlier proposal to reuse pg_stat_activity values is also invalid
because those timestamps don't exist when you SET track_activities =
OFF.

> But given this is done when stats are flushed, which only happens after the
> transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
> we got to flushing the transaction stats we'll already have computed that.

I'm not entirely happy with that, as that would still add function
call overhead, and potentially still call GetCurrentTimestamp() in
this somewhat hot loop.

As an alternative, we could wire the `now` variable in
pgstat_report_stat (generated from
GetCurrentTransactionStopTimestamp() into pgstat_flush_pending_entries
and then into flush_pending_cb (or, store this in a static variable)
so that we can reuse that value, saving any potential function call
overhead.

> > tabentry->numscans += lstats->t_counts.t_numscans;
> > + if (pgstat_track_scans && lstats->t_counts.t_numscans)
> > + tabentry->lastscan = GetCurrentTimestamp();
>
> Besides replacing GetCurrentTimestamp() with
> GetCurrentTransactionStopTimestamp(), this should then also check if
> tabentry->lastscan is already newer than the new timestamp.

I wonder how important that is. This value only gets set in a stats
flush, which may skew the stat update by several seconds (up to
PGSTAT_MAX_INTERVAL). I don't expect concurrent flushes to take so
long that it will set the values to It is possible, but I think it is
extremely unlikely that this is going to be important when you
consider that these stat flushes are not expected to run for more than
1 second.

Kind regards,

Matthias van de Meent

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2022-09-01 12:21:18 Re: TAP output format in pg_regress
Previous Message Dave Page 2022-09-01 12:18:00 Re: Tracking last scan time