Re: Add pg_stat_autovacuum_priority

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Robert Treat <rob(at)xzilla(dot)net>, satyanarlapuram(at)gmail(dot)com, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add pg_stat_autovacuum_priority
Date: 2026-03-31 18:09:43
Message-ID: CALj2ACVnQRS=T8SY2zegfft=Nku1n6w5609JPdd9jqiM+G_gGg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, Mar 30, 2026 at 11:16 AM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
>
> > I think we are also in agreement here, although based on my
> > experience, filtering out things like system and toast tables will be
> > common, but I don't see that changing what you said above. On a
> > similar note, +1 to your changes in v2.
>
> Thanks!
>
> v3 now includes the refactoring [1] suggestion brought up by Alvarro

Thank you for sending the latest patches. Here are some comments:

1/ + while ((classTup = heap_getnext(relScan, ForwardScanDirection)) != NULL)

Missing check_for_interrupts call while scanning the pg_class system catalog.

2/
+ avopts = extract_autovac_opts(classTup, pg_class_desc);
+
+ compute_autovac_score(classTup, pg_class_desc,
+ effective_multixact_freeze_max_age, avopts,
+ true, &dovacuum, &doanalyze,
+ &wraparound, &scores);
+
+ if (avopts)
+ pfree(avopts);
+

When a database has a large number of tables (which is quite common in
production scenarios), I expect the costs of palloc and pfree being
used for fetching autovacuum relopts would make this function slower.
Can we invent a new function or pass a caller-allocated AutoVacOpts
memory to just copy the relopts and use that in this tight loop when
scanning for all the relations?

3/
+ values[8] = Float8GetDatum(scores.vac_ins);
+ values[9] = Float8GetDatum(scores.anl);

Nit: It's a matter of taste. How about using something like below
instead of hardcoded column numbers? I expect this view to grow in the
future, so it helps to keep things simple.

values[i++] = Float8GetDatum(scores.anl);
Assert(i == NUM_AV_SCORE_COLS);

4/
+ The <link linkend="monitoring-pg-stat-autovacuum-priority-view">
+ <structname>pg_stat_autovacuum_priority</structname></link> view can be
+ used to inspect each table's autovacuum need and priority score.

How about adding "as of the moment" to convey that it doesn't report
what currently running autovacuum or pending autovacuum would
consider?

5/ Also, can we add a simple paragraph on how to interpret and take
actions based on the scores reported (like prioritizing one table over
the other - adjust these parameters in the table's relopts or
something like that - no need to cover all the possible cases, but
just one example would be sufficient for the user to understand)?

6/ + descr => 'statistics: autovacuum priority scores for all relations',

s/"for all relations"/"for all relations in the current database"

7/ Addition of force_scores to relation_needs_vacanalyze makes the
code unreadable (IMO) with a lot of if-else branching. Why not make
force_vacuum an option and pass it as true from the stats function and
leave a note in the function comment on when to use this parameter?
Would something like that work? Also, when autovacuum is disabled
(either via GUC or via relopts), we don't want to calculate and report
any scores. IMHO, this keeps things simple and code readable.

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2026-03-31 18:10:56 Re: EXPLAIN: showing ReadStream / prefetch stats
Previous Message Tomas Vondra 2026-03-31 18:03:02 Re: EXPLAIN: showing ReadStream / prefetch stats