Re: another autovacuum scheduling thread

From:	Sami Imseih <samimseih(at)gmail(dot)com>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: another autovacuum scheduling thread
Date:	2025-11-11 20:25:36
Message-ID:	CAA5RZ0sJGg209gSpEzLpO3DPiF8r8n_xVbQDnd_92BV+-5kvDA@mail.gmail.com
Lists:	pgsql-hackers

> On Sat, 8 Nov 2025 at 08:23, Sami Imseih <samimseih(at)gmail(dot)com> wrote:
> > > I'm confused at why we'd have set up our autovacuum trigger points as
> > > they are today because we think those are good times to do a
> > > vacuum/analyze, but then prioritise on something completely different.
> > > Surely if we think 20% dead tuples is worth a vacuum, we must
> > > therefore think that 40% dead tuples are even more worthwhile?!
> >
> > Sure, but thresholds alone don't indicate anything about how quickly
> > the table can be vacuumed, the number of indexes, per-table a/v settings, etc.
> > The average a/v time is a good proxy to determine this.
> >
> > What I am suggesting here is that we think beyond thresholds for
> > prioritization, and give more eligible tables a chance to get
> > autovacuumed rather than having workers saturated on some
> > of the slowest-to-vacuum tables.
>
> Can you define "more eligible" here?

What I mean by “more eligible” is that once a worker has its list of tables
that meet the autovacuum thresholds, it’s trying to get through as many
of them as possible within some time window.

If the workers always go after the slowest tables first, they’ll spend most
of that time on just a few heavy ones, and a lot of other eligible tables might
end up waiting much longer to get processed.

Eventually the slow tables will be the bottleneck anyway.
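
To put rough numbers on this, here is a toy model of a single worker going
through its list back to back. It is only a sketch: the durations are made up,
and real autovacuum has cost delays, multiple workers, and tables becoming
eligible while the worker runs, none of which is modelled here.

```python
from itertools import accumulate

def completion_times(durations):
    """Return the time at which each table finishes when one worker
    processes the tables back to back in the given order."""
    return list(accumulate(durations))

# Hypothetical vacuum durations in minutes: two slow tables, four fast ones.
durations = [120, 90, 5, 5, 3, 2]

slowest_first = completion_times(sorted(durations, reverse=True))
fastest_first = completion_times(sorted(durations))

mean_slowest_first = sum(slowest_first) / len(slowest_first)
mean_fastest_first = sum(fastest_first) / len(fastest_first)

print("slowest first:", slowest_first, "mean", round(mean_slowest_first, 1))
print("fastest first:", fastest_first, "mean", round(mean_fastest_first, 1))
```

In both orders the last table finishes at minute 225, which is the "slow
tables are the bottleneck anyway" part. The difference is in the waiting:
with these numbers the mean completion time is about 202 minutes slowest-first
versus about 60 minutes fastest-first, and four tables are done within the
first 30 minutes instead of none.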

> I think I'm not really grasping this because I don't understand why
> faster-to-vacuum tables should be prioritised over slower-to-vacuum
> tables. Can you explain why you think this is important?

The thing I’m hoping to address is something I’ve seen many times in practice.
Autovacuum workers can get stuck on specific large or slow tables, and when
that happens, users often end up running manual vacuums on those tables
just to keep things moving for the smaller, faster-to-vacuum tables.

Now, I am not so sure any type of autovacuum prioritization could actually
help in these cases. What does help is adding more autovacuum workers.
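
As a rough illustration of why extra workers help, the sketch below hands
tables, in slowest-first order, to whichever worker frees up first. The
function name and the durations are hypothetical, and the model is
deliberately simplified: it ignores that each worker builds its own list and
that the cost limit is balanced across the running workers, so real vacuums
would not keep the same speed as workers are added.

```python
import heapq

def finish_times(durations, workers):
    """Assign tables, in list order, to whichever worker frees up first.
    Returns each table's finish time, in the same order as `durations`."""
    free_at = [0] * workers          # time at which each worker is next idle
    heapq.heapify(free_at)
    finished = []
    for d in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
        finished.append(start + d)
    return finished

# Hypothetical vacuum durations in minutes, already sorted slowest first.
durations = [120, 90, 5, 5, 3, 2]

for n in (1, 2, 3):
    print(n, "worker(s):", finish_times(durations, n))
```

With one or two workers, the four small tables all wait behind a slow one.
With three workers, one more than the number of slow tables in this example,
the small tables finish within the first 15 minutes even though the ordering
is still slowest-first.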

> if we have the autovacuum worker refresh the list and scores after
> it's done with a table and autovacuum_naptime has elapsed since the
> list was last refreshed?

That is an interesting idea, but refreshing the list that often may not
be such a good idea; it could be quite expensive on large catalogs.
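
For reference, this is how I read the proposed loop, as a sketch only. None of
these names exist in the autovacuum code; build_list stands in for the catalog
scan plus scoring, and the clock is simulated so the example runs instantly.

```python
def run_worker(build_list, vacuum_table, naptime, clock):
    """Vacuum tables in priority order, rebuilding the list after a table
    completes only if at least `naptime` seconds have passed since the
    previous rebuild. Returns how many times the list was built."""
    tables = build_list()
    last_refresh = clock()
    rebuilds = 1
    while tables:
        vacuum_table(tables.pop(0))
        if clock() - last_refresh >= naptime:
            tables = build_list()    # the potentially expensive catalog scan
            last_refresh = clock()
            rebuilds += 1
    return rebuilds

# Toy driver; table names and durations (seconds) are hypothetical.
pending = {"big": 300, "mid": 90, "small_a": 10, "small_b": 10}
state = {"now": 0}

def clock():
    return state["now"]

def build_list():
    # Stand-in for scanning the catalogs and scoring: slowest first.
    return sorted(pending, key=pending.get, reverse=True)

def vacuum_table(name):
    # Advance the simulated clock by the table's vacuum duration.
    state["now"] += pending.pop(name)

rebuilds = run_worker(build_list, vacuum_table, naptime=60, clock=clock)
print("list built", rebuilds, "times")
```

The naptime check bounds the rebuilds to at most one per finished table, and
short vacuums in a row do not trigger any. The concern is the other case: when
most vacuums take longer than the naptime, the list gets rebuilt after nearly
every table.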

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Re: another autovacuum scheduling thread at 2025-11-11 00:58:22 from David Rowley

Responses

Re: another autovacuum scheduling thread at 2025-11-11 20:43:46 from David Rowley
