Re: row-level stats and last analyze time

From: Neil Conway <neilc(at)samurai(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: pgsql-docs(at)postgresql(dot)org, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: row-level stats and last analyze time
Date: 2007-04-26 22:07:07
Message-ID: 1177625227.6440.61.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Tue, 2007-04-24 at 17:38 -0400, Neil Conway wrote:
> which included other modifications to reduce the pgstat I/O volume in
> 8.1. I don't think this particular change was wise

I looked into this a bit further:

(1) I believe the reasoning for Tom's earlier change was not to reduce
the I/O between the backend and the pgstat process: it was to keep the
in-memory stats hash tables small, and to reduce the amount of data that
needs to be written to disk. When the only stats messages we get for a
table are VACUUM or ANALYZE messages, we discard the message in the
pgstat daemon.

(2) If stats_row_level is false, there won't be a stats hash entry for
any tables, so we can skip sending the VACUUM or ANALYZE message in the
first place, by the same logic. (This is more debatable if the user just
disabled stats_row_level for the current session, although since only a
super-user can do that, perhaps that's OK.)

(3) I don't like the fact that the current coding is so willing to throw
away VACUUM and ANALYZE pgstat messages. I think it is quite plausible
that the DBA might be interested in the last-VACUUM and last-ANALYZE
information for a table which hasn't had live operations applied to it
recently. The rest of the pgstat code has a similarly disappointing
willingness to silently discard messages it doesn't think are worth
keeping (e.g. pgstat_recv_autovac() is ignored for databases with no
other activity, and pgstat_count_xact_commit/rollback() is a no-op
unless *either* row-level or block-level stats are enabled.)

If we're so concerned about saving space in the stats hash tables for
tables that don't see non-VACUUM / non-ANALYZE activity, why not arrange
to record the timestamps for database-wide VACUUMs and ANALYZEs
separately from table-local VACUUMs and ANALYZEs? That is, a table's
last_vacuum time could effectively be the max of the last database-wide
vacuum time and the last VACUUM on that particular table. (Recording the
time of the last database-wide VACUUM might be worth doing anyway, e.g.
for avoiding wraparound failure).

Comments?

-Neil

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Neil Conway 2007-04-26 22:17:49 Doc tweak: remove outdated discussion of stats params
Previous Message Thomas F. O'Connell 2007-04-25 20:26:19 Re: [DOCS] Incrementally Updated Backups: Docs Clarification

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2007-04-26 22:24:08 Re: row-level stats and last analyze time
Previous Message Tom Lane 2007-04-26 20:27:15 Re: About the simple_heap_update function