Re: another autovacuum scheduling thread

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: another autovacuum scheduling thread
Date: 2025-10-23 20:24:34
Message-ID: CAApHDvpxE8ci83d02dRE3-fMetb4Dc89-80FrjkGDz2q+ByJog@mail.gmail.com
Lists: pgsql-hackers

On Fri, 24 Oct 2025 at 08:33, Sami Imseih <samimseih(at)gmail(dot)com> wrote:
> Yeah, you’re correct, the list already exists; sorry I missed that. My
> main concern is the additional overhead of the sort operation,
> especially if we have many eligible tables and an aggressive
> autovacuum_naptime.

It is true that there are reasons that millions of tables could
suddenly become eligible for autovacuum work with the consumption of a
single xid, but I imagine sorting the list of tables is probably the
least of the DBA's worries in that case, as sorting the
tables_to_process list is going to take a tiny fraction of the time
that doing the vacuum work will take.

If your concern is that the sort could take too large a portion of
someone's 1-second autovacuum_naptime, then you also need to
consider that the list isn't likely to be very long, as there's very
little time for tables to become eligible in such a short naptime. And
if the tables are piling up because autovacuum is configured to run
too slowly, then let's fix that at the root cause rather than
worrying about improving one area because another area needs work. If
we think like that, we'll remain gridlocked and autovacuum will never
be improved. TBH, I think that mindset has likely contributed quite a
bit to the fact that we've made about zero improvements in this area,
even though nobody thinks that nothing needs to be done.

There are also things that could be done if we were genuinely
concerned and had actual proof that this could reasonably be a
problem. sort_template.h would reduce the constant factor of the
indirect function call overhead by quite a bit. On a quick test here
with a table containing 1 million random float8 values, a Seq Scan and
in-memory Sort, EXPLAIN ANALYZE reports the sort took about 21ms:
(actual time=172.273..193.824). I really doubt anyone will be
concerned with 21ms when there's a list of 1 million tables needing to
be autovacuumed.
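
For anyone who wants to get a feel for the sort cost on their own
hardware, here is a minimal standalone sketch. It is not PostgreSQL
code: TableEntry, its score field, and all of the function names are
made up for illustration, and it uses plain qsort(), so every
comparison goes through a function pointer. That indirect call is the
overhead that a sort specialized via sort_template.h would be expected
to reduce.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for one entry in the list of tables to process. */
typedef struct TableEntry
{
	unsigned int oid;		/* made-up table identifier */
	double		score;		/* made-up priority; higher is more urgent */
} TableEntry;

/* qsort comparator: highest score first. Called via a function pointer. */
static int
compare_score_desc(const void *a, const void *b)
{
	const TableEntry *ta = (const TableEntry *) a;
	const TableEntry *tb = (const TableEntry *) b;

	if (ta->score > tb->score)
		return -1;
	if (ta->score < tb->score)
		return 1;
	return 0;
}

/* Fill the array with pseudo-random scores in the range [0, 1]. */
static void
fill_entries(TableEntry *entries, size_t n, unsigned int seed)
{
	assert(entries != NULL);
	srand(seed);
	for (size_t i = 0; i < n; i++)
	{
		entries[i].oid = (unsigned int) i;
		entries[i].score = (double) rand() / RAND_MAX;
	}
}

/* Sort the array and return the CPU time spent, in milliseconds. */
static double
sort_entries_ms(TableEntry *entries, size_t n)
{
	clock_t		start = clock();

	qsort(entries, n, sizeof(TableEntry), compare_score_desc);
	return (double) (clock() - start) * 1000.0 / CLOCKS_PER_SEC;
}

/* Return 1 if the scores are in non-increasing order, otherwise 0. */
static int
is_sorted_desc(const TableEntry *entries, size_t n)
{
	for (size_t i = 1; i < n; i++)
	{
		if (entries[i - 1].score < entries[i].score)
			return 0;
	}
	return 1;
}
```

To try it, allocate an array of 1,000,000 TableEntry structs in a
main(), call fill_entries() and then sort_entries_ms(), and print the
result. The number will vary by machine and libc and is not directly
comparable to the EXPLAIN ANALYZE figure above, but it should give a
rough idea of the order of magnitude.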

David
