Re: [BUG] Autovacuum not dynamically decreasing cost_limit and cost_delay

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: "Mead, Scott" <meads(at)amazon(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] Autovacuum not dynamically decreasing cost_limit and cost_delay
Date: 2023-02-23 22:22:14
Message-ID: CAAKRu_Zt5FkWdiJ-55D7VMDnKygmHJ6+YWzmNribCgavAk2pXA@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Mon, Feb 8, 2021 at 9:49 AM Mead, Scott <meads(at)amazon(dot)com> wrote:
> Initially, my goal was to determine feasibility for making this dynamic. I added debug code to vacuum.c:vacuum_delay_point(void) and found that changes to cost_delay and cost_limit are already processed by a running vacuum. There was a bug preventing the cost_delay or cost_limit from being configured to allow higher throughput however.
>
> I believe this is a bug because currently, autovacuum will dynamically detect and increase the cost_limit or cost_delay, but it can never decrease those values beyond their setting when the vacuum began. The current behavior is for vacuum to limit the maximum throughput of currently running vacuum processes to the cost_limit that was set when the vacuum process began.
>
> I changed this (see attached) to allow the cost_limit to be re-calculated up to the maximum allowable (currently 10,000). This has the effect of allowing users to reload a configuration change and an in-progress vacuum can be ‘sped-up’ by setting either the cost_limit or cost_delay.
>
> The problematic piece is:
>
> diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
> index c6ec657a93..d3c6b0d805 100644
> --- a/src/backend/postmaster/autovacuum.c
> +++ b/src/backend/postmaster/autovacuum.c
> @@ -1834,7 +1834,7 @@ autovac_balance_cost(void)
> * cost_limit to more than the base value.
> */
> worker->wi_cost_limit = Max(Min(limit,
> - worker->wi_cost_limit_base),
> + MAXVACUUMCOSTLIMIT),
> 1);
> }
>
> We limit the worker to the max cost_limit that was set at the beginning of the vacuum.

So, in do_autovacuum(), in the loop through all the relations we will be
vacuuming (around line 2308, under the comment "perform operations on
collected tables"), we reload the config file before operating on each
table [1]. Any changes you have made to autovacuum_vacuum_cost_limit or
other GUCs will be read and applied here.
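
For reference, that reload looks roughly like this (paraphrased, so
treat it as a sketch rather than an exact excerpt of autovacuum.c):

    /* inside the "perform operations on collected tables" loop */
    if (ConfigReloadPending)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);

        /* the real code then re-checks things like whether
         * autovacuum is still enabled before continuing */
    }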

Later in this same loop, table_recheck_autovac() sets
tab->at_vacuum_cost_limit from vac_cost_limit, which is derived from
autovacuum_vacuum_cost_limit or vacuum_cost_limit, so it will pick up
your refreshed value.
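
That computation is roughly the following (a sketch, not an exact
excerpt; per-table reloptions, if any, take precedence):

    /* in table_recheck_autovac(): 0 or -1 in the autovacuum GUC
     * means fall back to vacuum_cost_limit */
    vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
        ? avopts->vacuum_cost_limit
        : (autovacuum_vac_cost_limit > 0)
            ? autovacuum_vac_cost_limit
            : VacuumCostLimit;

    tab->at_vacuum_cost_limit = vac_cost_limit;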

Then a bit further down (just before autovac_balance_cost() is called),
MyWorkerInfo->wi_cost_limit_base is set from tab->at_vacuum_cost_limit.
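
Roughly (again a sketch):

    /* advertise our cost parameters for the balancing algorithm,
     * then do a balance */
    LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
    MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
    MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
    MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
    autovac_balance_cost();
    LWLockRelease(AutovacuumLock);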

In autovac_balance_cost(), when we loop through the running workers to
calculate the worker->wi_cost_limit, workers who have reloaded the
config file in the do_autovacuum() loop prior to our taking the
AutovacuumLock will have the new version of autovacuum_vacuum_cost_limit
in their wi_cost_limit_base.
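
So the clamp you quoted operates on whatever wi_cost_limit_base the
worker last advertised, roughly (sketch, not an exact excerpt):

    /* share out the available I/O budget among active workers */
    cost_avail = (double) vac_cost_limit / vac_cost_delay;
    limit = (int) (cost_avail * worker->wi_cost_limit_base / cost_total);

    /* never exceed the worker's advertised base limit */
    worker->wi_cost_limit = Max(Min(limit, worker->wi_cost_limit_base), 1);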

If you saw an old value in the DEBUG log output, that could be because
it was for a worker that had not yet reloaded the config file. (The
launcher also calls autovac_balance_cost(), but I will assume we are
just talking about the workers here.)

Note that this will only pick up changes between tables being
autovacuumed. If you want the value to be updated while autovacuum is in
the middle of vacuuming a table, we would need to reload the
configuration file more often than just between tables.

I have started a discussion about doing this in [2]. I made it a
separate thread because my proposed changes would have effects outside
of autovacuum: processing the config file reload in vacuum_delay_point()
would affect explicit vacuum and analyze as well (i.e. not just
autovacuum). Currently, explicit vacuum and analyze rely on the
per-statement config reload in PostgresMain().
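
That per-statement reload is just the usual pattern, roughly:

    /* in PostgresMain()'s main loop, between client commands */
    if (ConfigReloadPending)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
    }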

> Interestingly, autovac_balance_cost(void) is only updating the cost_limit, even if the cost_delay is modified. This is done correctly, it was just a surprise to see the behavior.

If this was observed while vacuuming a single table, it is expected for
the same reason described above.

- Melanie

[1] https://github.com/postgres/postgres/blob/master/src/backend/postmaster/autovacuum.c#L2324
[2] https://www.postgresql.org/message-id/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
