| From: | David Rowley <dgrowleyml(at)gmail(dot)com> | 
|---|---|
| To: | wenhui qiu <qiuwenhuifx(at)gmail(dot)com> | 
| Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: another autovacuum scheduling thread | 
| Date: | 2025-10-30 03:41:56 | 
| Message-ID: | CAApHDvrhEH8okuL+WVHxrC_W7JBSuvk6A5i3mzmrGTgp4RAvqg@mail.gmail.com | 
| Lists: | pgsql-hackers | 
On Thu, 30 Oct 2025 at 15:58, wenhui qiu <qiuwenhuifx(at)gmail(dot)com> wrote:
> In fact, with the introduction of the vacuum_max_eager_freeze_failure_rate feature, if a table's age still exceeds 1.x times autovacuum_freeze_max_age, it suggests that the vacuum freeze process is not functioning properly. Once the age surpasses vacuum_failsafe_age, wraparound issues are likely to occur soon. Taking the average of vacuum_failsafe_age and autovacuum_freeze_max_age is not a complex approach. Under the default configuration, this average already exceeds four times autovacuum_freeze_max_age. At that stage, a DBA should already have intervened to investigate and resolve why the table's age is not decreasing.
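(For reference, under the stock defaults of autovacuum_freeze_max_age = 200
million and vacuum_failsafe_age = 1.6 billion, the averaging described above
works out as follows; a quick sketch, not anything from the patch:)

```python
# PostgreSQL default GUC values, assumed here for illustration.
autovacuum_freeze_max_age = 200_000_000
vacuum_failsafe_age = 1_600_000_000

average = (autovacuum_freeze_max_age + vacuum_failsafe_age) // 2
print(average)                              # averaged threshold in XIDs
print(average / autovacuum_freeze_max_age)  # multiple of freeze_max_age
```

So the averaged threshold sits at 4.5x the default autovacuum_freeze_max_age.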
I don't think anyone would like to modify PostgreSQL in any way that
increases the chances of a table getting as old as vacuum_failsafe_age.
Regardless of the order in which tables are vacuumed, if a table gets
as old as that, then either vacuum is configured to run too slowly or
there are not enough workers configured to cope with the given amount
of work. I think we need to tackle prioritisation and rate limiting as
two separate items. Nathan is proposing to improve the prioritisation
in this thread, and it seems to me that your concerns are with rate
limiting. Earlier in this thread I suggested an idea that might help:
reducing the cost_delay based on the score of the table. I'd rather
not introduce that as a topic for further discussion here (I imagine
Nathan agrees). It's not as if the server is going to consume 1
billion XIDs in 5 minutes. That's going to take at least a day, and
likely longer, and if autovacuum has not managed to get on top of the
workload in that time, then it's configured to run too slowly and the
cost_limit or cost_delay needs to be adjusted.
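(To make the rate-limiting point concrete: cost-based vacuum delay sleeps
for cost_delay each time the accumulated page costs reach cost_limit, so
throughput is bounded by roughly cost_limit / cost_delay. A rough,
sleep-dominated sketch using the stock vacuum_cost_limit = 200,
autovacuum_vacuum_cost_delay = 2 ms and vacuum_cost_page_hit = 1:)

```python
def max_page_hits_per_second(cost_limit=200, cost_delay_ms=2.0,
                             page_hit_cost=1):
    """Upper bound on buffer hits per second when vacuum sleeps
    cost_delay_ms every time the accumulated cost reaches cost_limit.
    Ignores the time spent doing the work itself, so it's an
    approximation that holds when sleeping dominates."""
    sleep_rounds_per_second = 1000.0 / cost_delay_ms
    return sleep_rounds_per_second * (cost_limit / page_hit_cost)

print(max_page_hits_per_second())  # defaults: 100,000 page hits/s
```

Halving cost_delay or doubling cost_limit each doubles the bound, which
is why tuning either is the lever for a vacuum that can't keep up.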
My concern is that there are countless problems with autovacuum, and if
you try to lump them all into a single thread to fix them all at
once, we'll get nowhere. Autovacuum was added to core in 8.1, 20 years
ago, and I don't believe we've done anything to change the rate limiting
since then aside from reducing the default cost_delay. It'd be good to
fix that at some point, just not here, please.
FWIW, I agree with Nathan about keeping the score calculation
non-magical. The score should be simple and easy to document. We can
introduce complexity as and when it's needed and the supporting
evidence arrives, rather than on the strength of hand-waving.
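(Purely as an illustration of "non-magical" -- this is a hypothetical
sketch, not Nathan's actual formula -- a simple score could be nothing
more than the worst ratio of each trigger metric to its threshold:)

```python
def autovacuum_score(dead_tuples, dead_tuple_threshold,
                     table_age, freeze_max_age):
    """Hypothetical simple score: how far past each autovacuum trigger
    a table is, taking the worst ratio. A value of 1.0 means a trigger
    has just been reached; larger values mean more overdue."""
    return max(dead_tuples / dead_tuple_threshold,
               table_age / freeze_max_age)
```

Something of that shape is easy to document and easy to reason about,
which is the property being argued for here.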
David