Re: Tracking last scan time

From: Dave Page <dpage(at)pgadmin(dot)org>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tracking last scan time
Date: 2022-08-24 13:01:15
Message-ID: CA+OCxoxPSQukV7kuRuJ3nyrXX44x+bdd3n2HaT8LzCSzFyV3JQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

On Tue, 23 Aug 2022 at 13:07, Greg Stark <stark(at)mit(dot)edu> wrote:

> On Tue, 23 Aug 2022 at 11:00, Dave Page <dpage(at)pgadmin(dot)org> wrote:
> >
> > Often it is beneficial to review one's schema with a view to removing
> indexes (and sometimes tables) that are no longer required. It's very
> difficult to understand when that is the case by looking at the number of
> scans of a relation as, for example, an index may be used infrequently but
> may be critical in those times when it is used.
>
> I think this is easy to answer in a prometheus/datadog/etc world since
> you can consult the history of the count to see when it was last
> incremented. (Or do effectively that continously).
>

Yes. But not every PostgreSQL instance is monitored in that way.

>
> I guess that just reinforces the idea that it should be optional.
> Perhaps there's room for some sort of general feature for controlling
> various time series aggregates like max() and min() sum() or, uhm,
> timeoflastchange() on whatever stats you want. That would let us
> remove a bunch of stuff from pg_stat_statements and let users turn on
> just the ones they want. And also let users enable things like time of
> last rollback or conflict etc. But that's just something to think
> about down the road.
>

It's certainly an interesting idea.

>
> > The attached patch against HEAD adds optional tracking of the last scan
> time for relations. It updates pg_stat_*_tables with new last_seq_scan and
> last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to
> help with this.
> >
> > Due to the use of gettimeofday(), those values are only maintained if a
> new GUC, track_scans, is set to on. By default, it is off.
>
> Bikeshedding warning -- "track_scans" could equally apply to almost
> any stats about scans. I think the really relevant thing here is the
> times, not the scans. I think the GUC should be "track_scan_times". Or
> could that still be confused with scan durations? Maybe
> "track_scan_timestamps"?
>

The latter seems reasonable.

>
> You could maybe make the gettimeofday cheaper by doing it less often.
> Like, skipping the increment if the old timestamp is newer than 1s
> before the transaction start time (I think that's available free if
> some other guc is enabled but I don't recall). Or isn't this cb
> normally happening after transaction end? So xactStopTimestamp might
> be available already?
>

Something like:

if (pgstat_track_scan_timestamps && lstats->t_counts.t_numscans &&
tabentry->lastscan + USECS_PER_SEC <
GetCurrentTransactionStopTimestamp())
tabentry->lastscan = GetCurrentTimestamp();

?

--
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-08-24 13:30:18 Re: Strip -mmacosx-version-min options from plperl build
Previous Message Natarajan R 2022-08-24 13:00:29 Re: Logical replication support for generic wal record