Re: More stats about skipped vacuums

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: More stats about skipped vacuums
Date: 2017-10-31 09:49:37
Message-ID: 20171031.184937.57727308.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

This is just a repost as a (true) new thread.

At Mon, 30 Oct 2017 20:57:50 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20171030(dot)205750(dot)246076862(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> At Fri, 20 Oct 2017 19:15:16 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoAkaw-u0feAVN_VrKZA5tvzp7jT=mQCQP-SvMegKXHHaw(at)mail(dot)gmail(dot)com>
> > > n_mod_since_analyze | 20000
> > > + vacuum_requred | true
> > > + last_vacuum_oldest_xid | 8023
> > > + last_vacuum_left_to_truncate | 5123
> > > + last_vacuum_truncated | 387
> > > last_vacuum | 2017-10-10 17:21:54.380805+09
> > > last_autovacuum | 2017-10-10 17:21:54.380805+09
> > > + last_autovacuum_status | Killed by lock conflict
> > > ...
> > > autovacuum_count | 128
> > > + incomplete_autovacuum_count | 53
> > >
> > > # The last one might be needless..
> >
> > I'm not sure that the above informations will help for users or DBA
> > but personally I sometimes want to have the number of index scans of
> > the last autovacuum in the pg_stat_user_tables view. That value
> > indicates how efficiently vacuums performed and would be a signal to
> > increase the setting of autovacuum_work_mem for user.
>
> Btree and all existing index AMs (except brin) seem to visit the
> all pages in every index scan so it would be valuable. Instead
> the number of visited index pages during a table scan might be
> usable. It is more relevant to performance than the number of
> scans, on the other hand it is a bit difficult to get something
> worth from the number in a moment. I'll show the number of scans
> in the first cut.
>
> > > Where the "Killed by lock conflict" is one of the followings.
> > >
> > > - Completed
> > > - Truncation skipped
> > > - Partially truncated
> > > - Skipped
> > > - Killed by lock conflict
> > >
> > > This seems enough to find the cause of a table bloat. The same
> > > discussion could be applied to analyze but it might be the
> > > another issue.
> > >
> > > There may be a better way to indicate the vacuum soundness. Any
> > > opinions and suggestions are welcome.
> > >
> > > I'm going to make a patch to do the 'formal' one for the time
> > > being.

Done with small modifications. In the attached patch
pg_stat_all_tables has the following new columns. Documentations
is not provided at this stage.

-----
n_mod_since_analyze | 0
+ vacuum_required | not requried
last_vacuum |
last_autovacuum | 2017-10-30 18:51:32.060551+09
last_analyze |
last_autoanalyze | 2017-10-30 18:48:33.414711+09
vacuum_count | 0
+ last_vacuum_truncated | 0
+ last_vacuum_untruncated | 0
+ last_vacuum_index_scans | 0
+ last_vacuum_oldest_xmin | 2134
+ last_vacuum_status | agressive vacuum completed
+ autovacuum_fail_count | 0
autovacuum_count | 5
analyze_count | 0
autoanalyze_count | 1
-----
Where each column shows the following infomation.

+ vacuum_required | not requried

VACUUM requirement status. Takes the following values.

- partial
Partial (or normal) will be performed by the next autovacuum.
The word "partial" is taken from the comment for
vacuum_set_xid_limits.

- aggressive
Aggressive scan will be performed by the next autovacuum.

- required
Any type of autovacuum will be performed. The type of scan is
unknown because the view failed to take the required lock on
the table. (AutoVacuumrequirement())

- not required
Next autovacuum won't perform scan on this relation.

- not required (lock not acquired)

Autovacuum should be disabled and the distance to
freeze-limit is not known because required lock is not
available.

- close to freeze-limit xid
Shown while autovacuum is disabled. The table is in the
manual vacuum window to avoid anti-wraparound autovacuum.

+ last_vacuum_truncated | 0

The number of truncated pages in the last completed
(auto)vacuum.

+ last_vacuum_untruncated | 0
The number of pages the last completed (auto)vacuum tried to
truncate but could not for some reason.

+ last_vacuum_index_scans | 0
The number of index scans performed in the last completed
(auto)vacuum.

+ last_vacuum_oldest_xmin | 2134
The oldest xmin used in the last completed (auto)vacuum.

+ last_vacuum_status | agressive vacuum completed

The finish status of the last vacuum. Takes the following
values. (pg_stat_get_last_vacuum_status())

- completed
The last partial (auto)vacuum is completed.

- vacuum full completed
The last VACUUM FULL is completed.

- aggressive vacuum completed
The last aggressive (auto)vacuum is completed.

- error while $progress
The last vacuum stopped by error while $progress.
The $progress one of the vacuum progress phases.

- canceled while $progress
The last vacuum was canceled while $progress

This is caused by user cancellation of manual vacuum or
killed by another backend who wants to acquire lock on the
relation.

- skipped - lock unavailable
The last autovacuum on the relation was skipped because
required lock was not available.

- unvacuumable
A past autovacuum tried vacuum on the relation but it is not
vacuumable for reasons of ownership or accessibility problem.
(Such relations are not shown in pg_stat_all_tables..)

+ autovacuum_fail_count | 0
The number of successive failure of vacuum on the relation.
Reset to zero by completed vacuum.

======

In the patch, vacrelstats if pointed from a static variable and
cancel reporting is performed in PG_CATCH() section in vacuum().
Every unthrown error like lock acquisition failure is reported by
explicit pgstat_report_vacuum() with the corresponding finish
code.

Vacuum requirement status is calculated in AutoVacuumRequirment()
and returned as a string. Access share lock on the target
relation is required but it returns only available values if the
lock is not available. I decided to return incomplete (coarse
grained) result than wait for a lock that isn't known to be
relased in a short time for a perfect result.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2017-10-31 09:59:30 Re: Pluggable storage
Previous Message Kyotaro HORIGUCHI 2017-10-31 09:48:07 Re: show "aggressive" or not in autovacuum logs