Re: Tracking last scan time

From: Greg Stark <stark(at)mit(dot)edu>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tracking last scan time
Date: 2022-08-23 12:07:05
Message-ID: CAM-w4HMCByFm6vFdhXOSF=bA+RGmr0wXyHLwGmT-bUUAbHb6Aw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 23 Aug 2022 at 11:00, Dave Page <dpage(at)pgadmin(dot)org> wrote:
>
> Often it is beneficial to review one's schema with a view to removing indexes (and sometimes tables) that are no longer required. It's very difficult to understand when that is the case by looking at the number of scans of a relation as, for example, an index may be used infrequently but may be critical in those times when it is used.

I think this is easy to answer in a prometheus/datadog/etc world since
you can consult the history of the count to see when it was last
incremented. (Or do effectively that continously).

I guess that just reinforces the idea that it should be optional.
Perhaps there's room for some sort of general feature for controlling
various time series aggregates like max() and min() sum() or, uhm,
timeoflastchange() on whatever stats you want. That would let us
remove a bunch of stuff from pg_stat_statements and let users turn on
just the ones they want. And also let users enable things like time of
last rollback or conflict etc. But that's just something to think
about down the road.

> The attached patch against HEAD adds optional tracking of the last scan time for relations. It updates pg_stat_*_tables with new last_seq_scan and last_idx_scan columns, and pg_stat_*_indexes with a last_idx_scan column to help with this.
>
> Due to the use of gettimeofday(), those values are only maintained if a new GUC, track_scans, is set to on. By default, it is off.

Bikeshedding warning -- "track_scans" could equally apply to almost
any stats about scans. I think the really relevant thing here is the
times, not the scans. I think the GUC should be "track_scan_times". Or
could that still be confused with scan durations? Maybe
"track_scan_timestamps"?

You could maybe make the gettimeofday cheaper by doing it less often.
Like, skipping the increment if the old timestamp is newer than 1s
before the transaction start time (I think that's available free if
some other guc is enabled but I don't recall). Or isn't this cb
normally happening after transaction end? So xactStopTimestamp might
be available already?

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2022-08-23 12:17:04 Re: Letter case of "admin option"
Previous Message Isaac Morland 2022-08-23 11:30:57 Re: identifying the backend that owns a temporary schema