Re: another autovacuum scheduling thread

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Greg Burd <greg(at)burd(dot)me>, Robert Haas <robertmhaas(at)gmail(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: another autovacuum scheduling thread
Date: 2026-03-23 19:01:22
Message-ID: CAA5RZ0tz55uoQfcvLwvRWRPM0ry6SGfZ=-enQ4F0kaUgfPh+Vg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > 4/ Is adding a reason (such as how each of these scores influenced the
> > autovacuum to pick this table) to vacuum progress reporting a good
> > idea? This helps answer some of the why and how questions when the
> > autovacuum is in progress.
>
> Yeah, adding that in addition to a system view, etc. could be nice. I'm a
> little hesitant to start making big additions to the patch at this point,
> but I can give it a whirl if folks think something like this should be
> added for v19.

Adding a system view will be nice. I am attaching a version I used in earlier
testing (cleaned up with docs), if we are inclined to get this in. I
think it will be
useful.

This follows the same setup as do_autovacuum(); scanning pg_class,
filtering relation kinds and temp tables, and computing
effective_multixact_freeze_max_age
are done in the SQL-callable function, while another wrapper
compute_autovac_score() handles
the per-relation setup (snapshotting recentXid/recentMulti, fetching
reloptions and the pgstat entry)
before calling relation_needs_vacanalyze(). The function holds an
AccessShareLock on pg_class for
the duration of the scan, so this should be relatively lightweight.

```
test=# select * from pg_stat_autovacuum_priority order by score desc ;
relid | schemaname | relname | dovacuum |
doanalyze | wraparound | score
-------+--------------------+-----------------------------+----------+-----------+------------+-----------------------
16400 | public | pgbench_accounts | t |
f | t | 1.055318563196673e+16
16404 | public | pgbench_branches | t |
t | t | 442.01666666666665
16396 | public | pgbench_tellers | t |
t | t | 172.97333333333333
16393 | public | pgbench_history | t |
t | t | 4.703261221642761
14227 | pg_toast | pg_toast_14224 | t |
f | t | 2.08555407
```
Note in the test above, I used xid_wraparound to calculate a score
with the failsafe POW()
adjustment. Notice that this is a very high score being emitted as
discussed earlier [1].
This is documented in v14 as "scaled aggressively so that the table
has a decent chance of
sorting to the top of the list."

Maybe the doc should say something like " scaled aggressively, which
can produce very large values, to ensure
the table sorts to the top of the list."

[1] [https://www.postgresql.org/message-id/CAA5RZ0vfhAnFBp4HrBQc%2BALaJMx6vCvMtnBi39ST_4nH9PZEjA%40mail.gmail.com]

--
Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
v1-0001-Add-pg_stat_autovacuum_priority-view.patch application/octet-stream 15.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zsolt Parragi 2026-03-23 19:07:02 Re: Stack-based tracking of per-node WAL/buffer usage
Previous Message Maciek Sakrejda 2026-03-23 18:54:25 Re: V18 change on EXPLAIN ANALYZE