Re: Why our counters need to be time-based WAS: WIP: cross column correlation ...

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why our counters need to be time-based WAS: WIP: cross column correlation ...
Date: 2011-02-28 18:04:54
Message-ID: 4D6BE3C6.1000609@agliodbs.com
Lists: pgsql-hackers


> Well, what we have now is a bunch of counters in pg_stat_all_tables
> and pg_statio_all_tables.

Right. What I'm saying is those aren't good enough, and have never
been good enough. Counters without a time basis are pretty much useless
for performance monitoring/management (Baron Schwartz has a blog post
talking about this, but I can't find it right now).

Take, for example, a problem I was recently grappling with for Nagios.
I'd like to do a check as to whether or not tables are getting
autoanalyzed often enough. After all, autovac can fall behind, and we'd
want to be alerted of that.

The problem is, in order to measure whether or not autoanalyze is
behind, you need to count how many inserts, updates, and deletes have
happened since the last autoanalyze. pg_stat_user_tables just gives us
the counters since the last reset ... and the reset time isn't even
stored in PostgreSQL.
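
As a rough illustration of what an external tool has to do today, here
is a minimal Python sketch that diffs two snapshots of the cumulative
counters. The Snapshot class and changes_between() are hypothetical
names made up for this example; the field names mirror the n_tup_ins,
n_tup_upd and n_tup_del columns of pg_stat_user_tables, and the
timestamp has to be recorded by the tool itself because the server
does not supply one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One reading of a table's cumulative counters (hypothetical type)."""
    taken_at: float  # seconds since epoch, recorded by the monitoring tool
    n_tup_ins: int
    n_tup_upd: int
    n_tup_del: int

    def total_changes(self) -> int:
        return self.n_tup_ins + self.n_tup_upd + self.n_tup_del


def changes_between(older: Snapshot, newer: Snapshot):
    """Return (rows changed, rows changed per second) between two snapshots.

    Returns None when no time has elapsed or the total went backwards.
    A shrinking total is taken to mean the stats were reset in between;
    that is only a heuristic, since no reset time is available to check.
    """
    elapsed = newer.taken_at - older.taken_at
    delta = newer.total_changes() - older.total_changes()
    if elapsed <= 0 or delta < 0:
        return None
    return delta, delta / elapsed
```

Without the externally stored older snapshot and its timestamp, neither
the delta nor the rate can be computed from the counters alone.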

This means that, without adding external tools like pg_statsinfo, we
can't autotune autoanalyze at all.

There are quite a few other examples where the counters could contribute
to autotuning and DBA performance monitoring if only they were
time-based. As it is, they're useful for finding unused indexes and
that's about it.

One possibility, of course, would be to take pg_statsinfo and make it
part of core. There are a couple of disadvantages to that: (1) is the
storage and extra objects required, which would then require us to add
extra management routines as well; (2) is that pg_statsinfo only stores
top-level view history, meaning that it wouldn't be very adaptable to
improvements we make in system views in the future.

On the other hand, anything which increases the size of pg_statistic
would be a nightmare.

One possible compromise solution might be to implement code for the
stats collector to automatically reset the stats at a given clock
interval. If we combined this with keeping the reset time, and keeping
a snapshot of the stats from the last clock tick (and their reset
time), that would be "good enough" for most monitoring.
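
The idea can be sketched in a few lines of Python. This is only a toy
model under stated assumptions, not how the stats collector works: the
IntervalCounters class, its method names, and the "now" parameter
(used to make the clock explicit) are all invented for illustration.

```python
import time


class IntervalCounters:
    """Counters that reset on a fixed interval and keep the last interval.

    Hypothetical model: "current" holds counts since reset_time, and
    "previous" holds (start, end, counts) for the last completed tick.
    """

    def __init__(self, interval_seconds, now=None):
        self.interval = interval_seconds
        self.reset_time = time.time() if now is None else now
        self.current = {}
        self.previous = None

    def _maybe_tick(self, now):
        # Simplification: a long idle gap is rotated as one long interval.
        if now - self.reset_time >= self.interval:
            self.previous = (self.reset_time, now, self.current)
            self.current = {}
            self.reset_time = now

    def incr(self, name, amount=1, now=None):
        now = time.time() if now is None else now
        self._maybe_tick(now)
        self.current[name] = self.current.get(name, 0) + amount

    def previous_rate(self, name):
        """Per-second rate over the last completed interval, or None."""
        if self.previous is None:
            return None
        start, end, counters = self.previous
        return counters.get(name, 0) / (end - start)
```

Because every stored count carries its own start and end time, a
monitoring check can turn it into a rate without keeping any history
of its own.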

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
