Re: Add pg_stat_autovacuum_priority

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, satyanarlapuram(at)gmail(dot)com, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add pg_stat_autovacuum_priority
Date: 2026-04-03 19:13:16
Message-ID: CAA5RZ0sCRjH3xkHFdSXnKysdMZXFyaS_094+K-O_rr4Fkmwc=Q@mail.gmail.com
Lists: pgsql-hackers

> Alright, I've been preparing these for commit. Most changes are cosmetic,
> but there are a couple of bigger ones I should note:

Thanks!

> * I added a prerequisite patch for relation_needs_vacanalyze() that saves a
> level of indentation on a chunk of code. This simplifies 0001 (now 0002) a
> bit.

I like this.

> * I noticed that if autovacuum decides to force a vacuum for
> anti-wraparound purposes, it might also decide to analyze the table even if
> autovacuum is disabled for it. AFAICT this is accidental, but since it's
> behaved this way since commit 48188e1621 (2006) [0], I am slightly worried
> that this bug may have become a feature. In 0002, I separated this edge
> case in the code and added a comment, and I intend to start a new thread
> about removing it.

Hmm, yeah, I think this just needs to be documented clearly. I always
thought it was expected for auto-analyze to run in this case, and I don't
see why it shouldn't. If this needs to be clarified in the docs, we should
do that in a separate discussion.

> * I removed the booleans in the view in favor of just noting that scores >=
> 1.0 means the table needs processing. IMHO trying to distinguish
> needs_vacuum from do_vacuum is just going to confuse folks more than
> anything, and IIUC this information is redundant with "score >= 1.0",
> anyway.

That's fine by me.

> * I renamed the view to pg_autovacuum_scores. While some of the
> information in the view depends on cumulative statistics, not all of it
> does, and what does is quite heavily modified from the original stats. So,
> I didn't think the pg_stat_* prefix was appropriate, although I can see how
> reasonable people might disagree.

Initially I thought about moving this away from the cumulative stats section,
but this view does need to look up relation stats, and if relation stats are
reset, the same rules will apply to this view.

Also, not all views under "cumulative stats" are necessarily cumulative.
Some just show real-time data: pg_stat_activity, pg_stat_progress_*, etc.

This view does not have precedent in the type of work it does, but I really
do think it belongs under pg_stat_*, and that it should not be too far away
conceptually from the vacuum stats in pg_stat_all_tables.

> * I considered whether to make the backing function per-table and
> ultimately decided against it. The initialization logic is a bit
> expensive, and I assume most folks will be interested in the full picture
> of the current database. Maybe we could add a per-table function down the
> road, but I don't see any strong need for that for now.

Yes, I did not proceed with this since the common use case will be comparing
tables in a broader context. I don't see a strong single-table use case at
this point.

Besides the above, I have one comment on 0005:

Where it says "indicate the table needs analyzing" or "needs processing"
or "needs vacuuming", we should instead say "may need", since the actual
processing depends on the thresholds or force-vacuum conditions. There is
no need to go into that level of detail in the row descriptions, though;
that is all explained in the existing autovacuum prioritization docs.

--
Sami Imseih
Amazon Web Services (AWS)
