From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>
Subject: Re: New GUC autovacuum_max_threshold ?
Date: 2024-04-25 06:52:50
Message-ID: 5b55790c-16ed-4e90-800c-678ad2655ace@dalibo.com
Lists: pgsql-hackers
On 24/04/2024 21:10, Melanie Plageman wrote:
> On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel
> <frederic(dot)yhuel(at)dalibo(dot)com> wrote:
>>
>> Hello,
>>
>> I would like to suggest a new parameter, autovacuum_max_threshold, which
>> would set an upper limit on the number of tuples to delete/update/insert
>> prior to vacuum/analyze.
>
> Hi Frédéric, thanks for the proposal! You are tackling a very tough
> problem. I would also find it useful to know more about what led you
> to suggest this particular solution. I am very interested in user
> stories around difficulties with what tables are autovacuumed and
> when.
>
Hi Melanie! I can certainly start compiling user stories about that.
Recently, one of my colleagues wrote an email to our DBA team saying
something along these lines:
« Hey, here are our suggested settings for per-table autovacuum
configuration:

| *autovacuum*           | L < 1 million | L >= 1 million | L >= 5 million | L >= 10 million |
|:-----------------------|--------------:|---------------:|---------------:|----------------:|
| `vacuum_scale_factor`  | 0.2 (default) | 0.1            | 0.05           | 0.0             |
| `vacuum_threshold`     | 50 (default)  | 50 (default)   | 50 (default)   | 500,000         |
| `analyze_scale_factor` | 0.1 (default) | 0.1 (default)  | 0.05           | 0.0             |
| `analyze_threshold`    | 50 (default)  | 50 (default)   | 50 (default)   | 500,000         |

Let's update this table with values for the vacuum_insert_* parameters. »
I wasn't aware that we had this table, and although the settings made
sense to me, I thought it was rather ugly and cumbersome for the user,
so I started thinking about how Postgres could make their life easier.
> Am I correct in thinking that one of the major goals here is for a
> very large table to be more likely to be vacuumed?
>
Absolutely.
>> The idea would be to replace the following calculation:
>>
>> vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
>>
>> with this one :
>>
>> vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples / (1
>> + vac_scale_factor * reltuples / autovacuum_max_threshold)
>>
>> (and the same for the others, vacinsthresh and anlthresh).
>
> My first thought when reviewing the GUC and how it is used is
> wondering if its description is a bit misleading.
>
> autovacuum_vacuum_threshold is the "minimum number of updated or
> deleted tuples needed to trigger a vacuum". That is, if this many
> tuples are modified, it *may* trigger a vacuum, but we also may skip
> vacuuming the table for other reasons or due to other factors.
> autovacuum_max_threshold's proposed definition is the upper
> limit/maximum number of tuples to insert/update/delete prior to
> vacuum/analyze. This implies that if that many tuples have been
> modified or inserted, the table will definitely be vacuumed -- which
> isn't true. Maybe that is okay, but I thought I would bring it up.
>
I'm not too sure I understand. What are the reasons it might be skipped?
I can think of a concurrent index creation on the same table, or
anything holding a SHARE UPDATE EXCLUSIVE lock or above. Is this the
sort of thing you are talking about?
Perhaps a better name for the GUC would be
autovacuum_asymptotic_limit... or something like that?
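To make the asymptotic behaviour more concrete, here is a quick standalone
sketch (not the attached patch; the 500000 figure for
autovacuum_max_threshold is only an assumption) that prints the current and
proposed thresholds side by side, using the default base threshold of 50 and
scale factor of 0.2:

/*
 * Standalone sketch, not the patch itself: compare the current formula
 * with the proposed capped one for a few table sizes, with the default
 * autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2,
 * and an assumed autovacuum_max_threshold of 500000.
 * Build with: cc -o vacthresh vacthresh.c
 */
#include <stdio.h>

int
main(void)
{
    const double vac_base_thresh = 50;
    const double vac_scale_factor = 0.2;
    const double autovacuum_max_threshold = 500000; /* assumed value */
    const double sizes[] = {1e5, 1e6, 1e7, 1e8, 1e9};

    printf("%12s %15s %15s\n", "reltuples", "current", "proposed");
    for (int i = 0; i < 5; i++)
    {
        double reltuples = sizes[i];
        /* current: vac_base_thresh + vac_scale_factor * reltuples */
        double current = vac_base_thresh + vac_scale_factor * reltuples;
        /* proposed: same numerator, divided by (1 + scaled size / cap) */
        double proposed = vac_base_thresh + vac_scale_factor * reltuples /
            (1 + vac_scale_factor * reltuples / autovacuum_max_threshold);

        printf("%12.0f %15.0f %15.0f\n", reltuples, current, proposed);
    }
    return 0;
}

With these values, the proposed threshold levels off just below
vac_base_thresh + autovacuum_max_threshold (500050 here) for very large
tables, instead of growing linearly with reltuples.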
>> The attached (draft) patch further illustrates the idea.
>
> Thanks for including a patch!
>
>> My guess is that a similar proposal has already been submitted... and
>> rejected 🙂 If so, I'm very sorry for the useless noise.
>
> I rooted around in the hackers archive and couldn't find any threads
> on this specific proposal. I copied some other hackers I knew of who
> have worked on this problem and thought about it in the past, in case
> they know of some existing threads or prior work on this specific
> topic.
>
Thanks!