Re: row-level stats and last analyze time

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, pgsql-docs(at)postgresql(dot)org, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: row-level stats and last analyze time
Date: 2007-04-26 22:24:08
Message-ID: 20070426222408.GH24522@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

Neil Conway wrote:

> (3) I don't like the fact that the current coding is so willing to throw
> away VACUUM and ANALYZE pgstat messages. I think it is quite plausible
> that the DBA might be interested in the last-VACUUM and last-ANALYZE
> information for a table which hasn't had live operations applied to it
> recently. The rest of the pgstat code has a similarly disappointing
> willingness to silently discard messages it doesn't think are worth
> keeping (e.g. pgstat_recv_autovac() is ignored for databases with no
> other activity, and pgstat_count_xact_commit/rollback() is a no-op
> unless *either* row-level or block-level stats are enabled.)

One thing to keep in mind is that autovac drives some decision from
whether the database has a pgstat entry or not. In particular it means
it doesn't bother processing non-connectable databases, unless they are
close to Xid wraparound.

I think this behavior is a useful one, since usually vacuuming those
databases is a waste of time anyway. Whether to drive it from pgstat or
from somewhere else is another matter, but if you want to drive it from
another mechanism, keep in mind that the autovacuum launcher (which is
the process that makes this decision) is not connected to any database
so it cannot examine any catalog's content. There are of course ways
around that: for example you could put the information in the
pg_database flatfile. But it's something to keep in mind if you want to
change it.

> If we're so concerned about saving space in the stats hash tables for
> tables that don't see non-VACUUM / non-ANALYZE activity, why not arrange
> to record the timestamps for database-wide VACUUMs and ANALYZEs
> separately from table-local VACUUMs and ANALYZEs? That is, a table's
> last_vacuum time could effectively be the max of the last database-wide
> vacuum time and the last VACUUM on that particular table. (Recording the
> time of the last database-wide VACUUM might be worth doing anyway, e.g.
> for avoiding wraparound failure).

Another thing to keep in mind is that autovacuum does not do
database-wide vacuums anymore -- they are not needed. Xid wraparound
decisions are handled on a table-by-table basis, so information about
when the last database-wide vacuum was is not needed.

Note that Xid wraparound decisions are driven by information in
pg_class. So it's not a problem that pgstat may lose the info from this
POV.

The bottom line is that the current pgstat behavior and autovacuum are
closely related. So if you want to change pgstats you should also keep
an eye on how it's going to affect autovac.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Brian Gough 2007-05-03 14:32:33 patch for doubled words
Previous Message Neil Conway 2007-04-26 22:17:49 Doc tweak: remove outdated discussion of stats params

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2007-04-26 22:51:58 Re: Interaction of PITR backups and Bulkoperationsavoiding WAL
Previous Message Neil Conway 2007-04-26 22:07:07 Re: row-level stats and last analyze time