Re: Track skipped tables during autovacuum and autoanalyze

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Track skipped tables during autovacuum and autoanalyze
Date: 2026-03-24 14:58:48
Message-ID: CAA5RZ0snnePNW1NKGKh+NyJ1CY26T5F_6-tTq+BHWM2kj1fN1g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for the patch!

> The attached patch add the following fields to pg_stat_all_tables:
> - last_skipped_autovacuum
> - last_skipped_autoanalyze
> - skipped_autovacuum_count
> - skipped_autoanalyze_count
>
> Are there any concerns about exposing this in pg_stat_all_tables, or suggestions
> for a better approach?

I am not sure about the timestamp columns. I am not saying they will
not be useful,
but I think it will be better to just start with counters for this.
The way the views get
used, a dashboard built for tracking the deltas of the counters can easily spot
when there is a spike of skipped autovacuum/autoanalyze count.
Also, for tables that are being autovacuumed and skipped quickly,
the timestamps will just be overwritten.

So, I am +1 on the counters, -1 on the timestamps.

Out of scope for this patch, but I also wonder if we should add another counter,
autovacuum_started_count. If there are other types of failure scenarios such as
corrupt indexes, checksum failures, etc. which terminate the
autovacuum in flight,
we would be able to catch this by looking at the number of autovacuums
started vs completed. The skipped counters in this patch and a started
counter would capture different stages of the autovacuum lifecycle;
skipped means
"never started" (lock contention), while a started-minus-completed delta means
"started but failed." Both are useful signals, but for different reasons.

In terms of the patch:

1/

+ if (AmAutoVacuumWorkerProcess())
+ pgstat_report_skipped_vacuum(relid);

Any reason why this should not also include manual vacuum/analyze?
If someone has a vacuum/analyze script that uses SKIP_LOCKED, and
the operation gets skipped, this should be included in the counter.
this can be done with separate counter fields for autovacuum/autoanalyze and
vacuum/analyze

2/

+ pg_stat_get_skipped_autovacuum_count(C.oid) AS
skipped_autovacuum_count,

How about a name like "autovacuum_lock_skip_count"?

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Henson Choi 2026-03-24 15:07:30 Re: SQL Property Graph Queries (SQL/PGQ)
Previous Message Nathan Bossart 2026-03-24 14:34:23 Re: BUG: test_bloomfilter error message reports wrong variable and wrong format specifier