Re: New GUC autovacuum_max_threshold ?

From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: New GUC autovacuum_max_threshold ?
Date: 2024-04-25 20:57:08
Message-ID: a7403191-a4c2-4629-aebe-0106f8f0fc04@dalibo.com
Lists: pgsql-hackers

On 25/04/2024 at 21:21, Nathan Bossart wrote:
> On Thu, Apr 25, 2024 at 02:33:05PM -0400, Robert Haas wrote:
>> What does surprise me is that Frédéric suggests a default value of
>> 500,000. If half a million tuples (proposed default) is 20% of your
>> table (default value of autovacuum_vacuum_scale_factor) then your
>> table has 2.5 million tuples. Unless those tuples are very wide, that
>> table isn't even 1GB in size. I'm not aware that there's any problem
>> at all with the current formula on a table of that size, or even ten
>> times that size. I think you need to have tables that are hundreds of
>> gigabytes in size at least before this starts to become a serious
>> problem. Looking at this from another angle, in existing releases, the
>> maximum usable amount of autovacuum_work_mem is 1GB, which means we
>> can store one-sixth of a billion dead TIDs, or roughly 166 million.
>> And that limit has been a source of occasional complaints for years.
>> So we have those complaints on the one hand, suggesting that 166
>> million is not enough, and then we have this proposal, saying that
>> more than half a million is too much. That's really strange; my
>> initial hunch is that the value should be 100-500x higher than what
>> Frédéric proposed.
>
> Agreed, the default should probably be on the order of 100-200M minimum.
>

I'm not sure... 500000 comes from the table given in a previous message.
It may not be large enough. But vacuum also updates the visibility map,
and a few hundred thousand heap fetches can already hurt the performance
of an index-only scan, even if most of the blocks are read from cache.
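
To put rough numbers on this, here is a quick back-of-the-envelope
sketch (plain Python, nothing authoritative; the only inputs are the
default GUC values and the 6-byte TID size, the rest is arithmetic):

# where a 500,000-tuple cap would start to bind, given the defaults
base_thresh = 50        # autovacuum_vacuum_threshold (default)
scale_factor = 0.2      # autovacuum_vacuum_scale_factor (default)
proposed_cap = 500_000
reltuples = (proposed_cap - base_thresh) / scale_factor
print(f"cap binds at ~{reltuples:,.0f} tuples")   # ~2.5 million (Robert's point)

# and the dead-TID side: 1GB of autovacuum_work_mem at 6 bytes per TID
work_mem_bytes = 1 * 1024**3
print(f"~{work_mem_bytes // 6:,} dead TIDs fit in 1GB")
# ~179 million with a binary GB, i.e. roughly the 166 million Robert
# gets with a decimal billion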

> The original proposal also seems to introduce one parameter that would
> affect all three of autovacuum_vacuum_threshold,
> autovacuum_vacuum_insert_threshold, and autovacuum_analyze_threshold. Is
> that okay? Or do we need to introduce a "limit" GUC for each? I guess the
> question is whether we anticipate any need to have different values for
> these limits, which might be unlikely.
>

I agree with you, it seems unlikely. This also answers Melanie's
question about the name of the GUC: I deliberately left out the second
"vacuum" because I thought we only needed one parameter for these three
thresholds.
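
For what it's worth, here is roughly how I picture the single limit
slotting into the three existing computations (a sketch only, written
in Python for readability; the real code in autovacuum.c is C, and the
cap's name and default are of course still up for discussion):

def capped_threshold(base, scale, reltuples, cap):
    # existing formula (base + scale * reltuples) with the proposed cap
    return min(base + scale * reltuples, cap)

reltuples = 1_000_000_000    # a big table
cap = 500_000                # hypothetical autovacuum_max_threshold

vacuum_t  = capped_threshold(50,   0.2, reltuples, cap)  # vacuum
insert_t  = capped_threshold(1000, 0.2, reltuples, cap)  # vacuum (inserts)
analyze_t = capped_threshold(50,   0.1, reltuples, cap)  # analyze
print(vacuum_t, insert_t, analyze_t)   # the cap wins for all three here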

Now I have just read Robert's new message, and I understand his point.
But is there a real problem with triggering an analyze after every
500000 (or more) modifications to the table anyway?
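
Concretely, for a large table the cap mostly changes how often analyze
fires (same illustrative assumptions as in the sketch above):

# analyze cadence on a 1-billion-row table, with and without the cap
reltuples = 1_000_000_000
uncapped = 50 + 0.1 * reltuples      # ~100 million modifications between analyzes
capped = min(uncapped, 500_000)      # every 500,000 modifications
print(f"analyze every {uncapped:,.0f} vs every {capped:,.0f} modified tuples")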
