Re: another autovacuum scheduling thread

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Jeremy Schneider <schneider(at)ardentperf(dot)com>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: another autovacuum scheduling thread
Date: 2025-10-09 03:13:23
Message-ID: CAApHDvq76BBseUh2cG0=m=8r-j6HF_jrQt16Eszgsxp3bciGQw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 9 Oct 2025 at 14:47, Jeremy Schneider <schneider(at)ardentperf(dot)com> wrote:
>
> On Wed, 8 Oct 2025 18:25:20 -0700
> Jeremy Schneider <schneider(at)ardentperf(dot)com> wrote:
> > If users are tuning this thing then I feel like we've already lost the
> > battle :)
>
> I replied too quickly. Re-reading your email, I think you're proposing a
> different algorithm, taking tuple counts into account. No tunables. Is
> there a fully fleshed out version of the proposed alternative algorithm
> somewhere? (one of the older threads?) I guess this is why it's so hard
> to get anything committed in this area...

It's along the lines of the "1a)" from [1]. I don't think that post
does a great job of explaining it.

I think the best way to understand it is to look at
relation_needs_vacanalyze() and see how it calculates boolean values
for its boolean output params. Instead of calculating just a boolean,
it would calculate a float4 where < 1.0 means don't do the operation
and anything >= 1.0 means do the operation. For example, let's say a
table has 600 dead rows and the scale factor and threshold settings
mean that autovacuum will trigger at 200, i.e. the table has 3 times
as many dead tuples as the trigger point. That would result in a value
of 3.0 (600 / 200). The score for the relfrozenxid portion is basically
age(relfrozenxid) / autovacuum_freeze_max_age (plus we need to account
for mxid by doing the same for that and taking the maximum of the two
values). The overall priority for autovacuum would then be the maximum
of those component "scores".

Effectively, it's a method of aligning the different units of measure
(transactions or tuples) into a single value, which is calculated from
the very same values that we use today to trigger autovacuums.
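
To make the arithmetic concrete, here is a minimal sketch in Python
(not the actual patch, and not code from autovacuum.c). The function
and parameter names are made up for illustration, the trigger point is
passed in directly rather than derived from the scale factor and
threshold settings, and I'm assuming "doing the same for mxid" means
dividing the multixact age by its own freeze max age. The freeze ages
in the example call are invented numbers.

```python
def dead_tuple_score(dead_tuples: float, trigger_point: float) -> float:
    """Ratio of dead tuples to the point at which autovacuum would trigger.

    < 1.0 means "don't vacuum yet", >= 1.0 means "vacuum".
    """
    if trigger_point <= 0:
        # Hypothetical guard: treat a non-positive trigger point as "no score".
        return 0.0
    return dead_tuples / trigger_point


def freeze_score(xid_age: int, freeze_max_age: int,
                 mxid_age: int, multixact_freeze_max_age: int) -> float:
    """Larger of the xid and mxid age ratios (assumed to mirror each other)."""
    return max(xid_age / freeze_max_age,
               mxid_age / multixact_freeze_max_age)


def autovacuum_priority(dead_tuples: float, trigger_point: float,
                        xid_age: int, freeze_max_age: int,
                        mxid_age: int, multixact_freeze_max_age: int) -> float:
    """Overall priority: the maximum of the component scores."""
    return max(
        dead_tuple_score(dead_tuples, trigger_point),
        freeze_score(xid_age, freeze_max_age,
                     mxid_age, multixact_freeze_max_age),
    )


# The example from the text: 600 dead rows, trigger point of 200.
# The freeze-related numbers are invented and give ratios of 0.5 and 0.1.
score = autovacuum_priority(600, 200,
                            100_000_000, 200_000_000,
                            40_000_000, 400_000_000)
print(score)  # 3.0, since the dead-tuple ratio (600 / 200) dominates
```

Because every component is a ratio against its own trigger value, the
tuple-based and transaction-based scores end up on the same scale and
can be compared with a plain max().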

David

[1] https://postgr.es/m/CAApHDvo8DWyt4CWhF=NPeRstz_78SteEuuNDfYO7cjp=7YTK4g@mail.gmail.com
