Re: [Proposal] More Vacuum Statistics

From: Naoya Anzai <nao-anzai(at)xc(dot)jp(dot)nec(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc: Akio Iwaasa <aki-iwaasa(at)vt(dot)jp(dot)nec(dot)com>, "bench(dot)coffee(at)gmail(dot)com" <bench(dot)coffee(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: [Proposal] More Vacuum Statistics
Date: 2015-06-05 12:10:44
Message-ID: 116262CF971C844FB6E793F8809B51C6E95F3A@BPXM02GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for quick feedback, and I'm sorry for slow response.
All of your opinions were very helpful for me.

I have confirmed Greg's Idea "Timing events".
http://www.postgresql.org/message-id/509300F7.5000803@2ndQuadrant.com

Greg said at first,
"Parsing log files for commonly needed performance data is no fun."
Yes, I completely agree with him.

That looks a nice idea but I don't know why this idea has
not been commited yet. Anybody knows?

I have reworked my idea since I heard dear hacker's opinions.

====================
pg_stat_vacuum view
====================

I understand it is not good to simply add more counters in
pg_stat_*_tables. For now, I'd like to suggest an extension
which can confirm vacuum statistics like pg_stat_statements.

VACUUM is a most important feature in PostgreSQL, but a
special view for vacuum does not exist. Don't you think
the fact is inconvenience? At least, I am disgruntled with
that we need to parse pg_log for tune VACUUM.

My first design of pg_stat_vacuum view is following.
(There are two views.)

pg_stat_vacuum_table
---------------
dbid
schemaname
relid
relname
elapsed
page_removed
page_remain
page_skipped
tuple_removed
tuple_remain
tuple_notremovable
buffer_hit
buffer_miss
buffer_dirty
avg_read
avg_write
vm_count
vac_start
vac_end
is_autovacuum

pg_stat_vacuum_index
---------------
dbid
shemaname
relid
indexrelid
indexname
elapsed
num_index_tuples
num_pages
tuples_removed
pages_deleted
pages_free
is_autovacuum

At present, I think memory design of pg_stat_statements can
divert into this feature.And I think this module needs to
prepare following parameters like pg_stat_statements.

pg_stat_vacuum.max(integer)
pg_stat_vacuum.save(boolean)
pg_stat_vacuum.excluded_dbnames(text)
pg_stat_vacuum.excluded_schemas(text)
pg_stat_vacuum.min_duration(integer)
... and so on.

To implement this feature, I have to collect each vacuum-stats
every lazy_vacuum_* and I need to embed a hook function point
where needed. (probably last point of lazy_vacuum_rel).
Do you hesitate to add the hook only for this function?

Similar feature has been already provided by pg_statsinfo package.
But it is a full-stack package for PG-stats and it needs to
redesign pg_log and design a repository database for introduce.
And it is not a core-extension for PostgreSQL.
(I don't intend to hate pg_statsinfo,
I think this package is a very convinient tool)

Everyone will be able to do more easily tuning of VACUUM.
That's all I want.

Any comments are welcome!

Best Regards,

Naoya Anzai

---
Naoya Anzai
Engineering Department
NEC Solution Inovetors, Ltd.
E-Mail: nao-anzai(at)xc(dot)jp(dot)nec(dot)com
---

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-06-05 12:21:45 Re: nested loop semijoin estimates
Previous Message Robert Haas 2015-06-05 11:56:14 Re: RFC: Remove contrib entirely