Re: Publish autovacuum informations

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-02 06:30:53
Message-ID: CAB7nPqT+edmThH=rSsK4uDHmCWgycVPqVe2K4TeoBf_w+AUgfg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 1, 2016 at 11:37 PM, Julien Rouhaud wrote:
> I'm not sure what are the fancy things that Michael had in mind with
> exposing the private structure. Michael, was it something like having
> the ability to change some of these data through an extension?

I was referring to you here :)
I have already witnessed many fancy things coming out of your brain,
and I have no doubt you could make something out of just a sampling of
data. Jokes apart, what I mean by fancy things here is putting the
sampled data into whatever shape suits the user. It would be easy to
exploit that, for example, in a background worker that periodically
scans the shared memory, or in an extension that exposes this shared
memory data as something that can be queried at the SQL level. Now,
the main use cases that I have for the data available in shared memory
are more or less:
- Represent the current shared memory as SQL.
- Scan this data, reshape it and report it elsewhere, say with a
background worker printing out a file.

Now, take your set of hooks... There are 4 hooks here:
- autovacuum_list_tables_hook, to do something with the list of tables
a worker has collected, at the moment they have been collected.
- autovacuum_end_table_hook, to do something once a table is known to
have been skipped or cancelled.
- autovacuum_begin_table_hook, to trigger something when a relation is
beginning to be processed.
- autovacuum_database_finished_hook, to trigger something once a
database is done with its processing.
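
For readers unfamiliar with the pattern, hooks of this kind are usually
plain function pointers that the core code calls when they are set, and
that an extension installs while chaining to any previous value. The
following is only a minimal standalone sketch of that pattern: the
signature taking a single Oid, the Oid typedef, and the helper names
count_begin_table, install_hooks and process_tables are assumptions
made for illustration, not the declarations from the patch.

```c
#include <stddef.h>

/* Stand-in for PostgreSQL's Oid so that the sketch compiles on its own. */
typedef unsigned int Oid;

/* Hypothetical hook signature; the patch may declare it differently. */
typedef void (*autovacuum_begin_table_hook_type) (Oid relid);

/* "Core" side: NULL means that no extension has installed a hook. */
static autovacuum_begin_table_hook_type autovacuum_begin_table_hook = NULL;

/* "Extension" side: remember the previous hook so that it can be chained. */
static autovacuum_begin_table_hook_type prev_begin_table_hook = NULL;
static int	tables_started = 0;

static void
count_begin_table(Oid relid)
{
	tables_started++;
	/* Be polite to any hook installed before this one. */
	if (prev_begin_table_hook)
		prev_begin_table_hook(relid);
}

/* Roughly what an extension would do when it is loaded. */
static void
install_hooks(void)
{
	prev_begin_table_hook = autovacuum_begin_table_hook;
	autovacuum_begin_table_hook = count_begin_table;
}

/* Stand-in for the worker loop: call the hook before each table. */
static void
process_tables(const Oid *relids, size_t nrelids)
{
	for (size_t i = 0; i < nrelids; i++)
	{
		if (autovacuum_begin_table_hook)
			autovacuum_begin_table_hook(relids[i]);
		/* ... the table itself would be vacuumed or analyzed here ... */
	}
}
```

Until install_hooks() is called, process_tables() leaves the counter
untouched; afterwards it is incremented once per table.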

The only use cases that I have in mind for those hooks, which would
help in decision-making when tuning autovacuum-related parameters, are
the following:
- Log entries regarding those operations, in which case why not
introduce a GUC parameter that is an on/off switch, like
log_autovacuum (not a per-relation parameter), defaulting to off.
Those entries could be used by pgbadger.
- Have system statistics in a new system relation like
pg_stat_autovacuum, and make this information available to users.
Are there other things that you think could make use of those hooks?
Your extension just does pg_stat_autovacuum and emulates the existing
pg_stat_* facility when gathering information about the global
autovacuum statistics. So it seems to me that those hooks are not that
necessary, and that native statistics of this kind would help in
tuning a system; the number of relations skipped in particular would
be interesting to have.
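
As an illustration of the kind of counters such a statistics relation
might carry, here is a small standalone sketch. Everything in it is
hypothetical: AutovacStatsSketch, AutovacOutcome and the two functions
are names invented for this example, not existing PostgreSQL
structures or a proposed catalog layout.

```c
#include <stdint.h>

/* Hypothetical per-database counters; not an existing PostgreSQL struct. */
typedef struct AutovacStatsSketch
{
	uint64_t	tables_processed;
	uint64_t	tables_skipped;
	uint64_t	tables_cancelled;
} AutovacStatsSketch;

/* Possible outcomes for one table picked up by a worker (assumed set). */
typedef enum AutovacOutcome
{
	AV_PROCESSED,
	AV_SKIPPED,
	AV_CANCELLED
} AutovacOutcome;

/* Bump the counter matching the outcome of one table. */
static void
autovac_stats_record(AutovacStatsSketch *stats, AutovacOutcome outcome)
{
	switch (outcome)
	{
		case AV_PROCESSED:
			stats->tables_processed++;
			break;
		case AV_SKIPPED:
			stats->tables_skipped++;
			break;
		case AV_CANCELLED:
			stats->tables_cancelled++;
			break;
	}
}

/* Fraction of attempted tables that were skipped; 0.0 if none attempted. */
static double
autovac_skip_ratio(const AutovacStatsSketch *stats)
{
	uint64_t	total = stats->tables_processed +
		stats->tables_skipped +
		stats->tables_cancelled;

	return total == 0 ? 0.0 : (double) stats->tables_skipped / (double) total;
}
```

A persistently high skip ratio would be one possible hint that workers
are frequently giving up on tables, which is the sort of signal that
could feed tuning decisions.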

The stats could have a delay, the point being to have hints on how
autovacuum workers are sorting things out. In short, I am doubtful
about the need for hooks in those code paths. What we should try to do
instead is improve the native solutions to give users more information
about how autovacuum works, which would help in tuning it.
--
Michael
