Re: [BUG] Autovacuum not dynamically decreasing cost_limit and cost_delay

From: David Zhang <david(dot)zhang(at)highgo(dot)ca>
To: "Mead, Scott" <meads(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] Autovacuum not dynamically decreasing cost_limit and cost_delay
Date: 2021-02-12 20:03:56
Message-ID: 4e980c7d-9997-5ab7-472c-75f377b26a76@highgo.ca
Lists: pgsql-bugs pgsql-hackers

Thanks for the patch, Mead.

For "MAXVACUUMCOSTLIMIT", it would be nice to follow the current GUC
naming pattern when defining the constant.

For example, the constant "MAX_KILOBYTES" is defined in guc.h with a
"MAX_" prefix and underscore-separated words, which makes it easy to read.
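
As a rough sketch of what I mean (the name MAX_VACUUM_COST_LIMIT and the
helper clamp_cost_limit() are hypothetical and not part of the patch; the
value 10000 is the current maximum mentioned in your mail, and Max()/Min()
are local stand-ins for the macros PostgreSQL already provides):

```c
/* Hypothetical name following the "MAX_" pattern; not part of the patch. */
#define MAX_VACUUM_COST_LIMIT 10000

/* Local stand-ins so the sketch compiles outside the PostgreSQL tree. */
#define Max(x, y) ((x) > (y) ? (x) : (y))
#define Min(x, y) ((x) < (y) ? (x) : (y))

/*
 * Illustrative helper: clamp a rebalanced cost limit to the range
 * [1, MAX_VACUUM_COST_LIMIT], mirroring the Max(Min(...), 1) expression
 * in the quoted diff.
 */
static int
clamp_cost_limit(int limit)
{
	return Max(Min(limit, MAX_VACUUM_COST_LIMIT), 1);
}
```

With this naming the call site would read Max(Min(limit,
MAX_VACUUM_COST_LIMIT), 1), which matches how the other "MAX_" constants
are used.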

Best regards,

David

On 2021-02-08 6:48 a.m., Mead, Scott wrote:
> Hello,
>    I recently looked at what it would take to make a running
> autovacuum pick-up a change to either cost_delay or cost_limit.  Users
> frequently will have a conservative value set, and then wish to change
> it when autovacuum initiates a freeze on a relation.  Most users end
> up finding out they are in ‘to prevent wraparound’ after it has
> happened, this means that if they want the vacuum to take advantage of
> more I/O, they need to stop and then restart the currently running
> vacuum (after reloading the GUCs).
>   Initially, my goal was to determine feasibility for making this
> dynamic.  I added debug code to vacuum.c:vacuum_delay_point(void) and
> found that changes to cost_delay and cost_limit are already processed
> by a running vacuum.  There was a bug preventing the cost_delay or
> cost_limit from being configured to allow higher throughput however.
> I believe this is a bug because currently, autovacuum will dynamically
> detect and /increase/ the cost_limit or cost_delay, but it can never
> decrease those values beyond their setting when the vacuum began.  The
> current behavior is for vacuum to limit the maximum throughput of
> currently running vacuum processes to the cost_limit that was set when
> the vacuum process began.
> I changed this (see attached) to allow the cost_limit to be
> re-calculated up to the maximum allowable (currently 10,000).  This
> has the effect of allowing users to reload a configuration change and
> an in-progress vacuum can be ‘sped-up’ by setting either the
> cost_limit or cost_delay.
> The problematic piece is:
> diff --git a/src/backend/postmaster/autovacuum.c
> b/src/backend/postmaster/autovacuum.c
> index c6ec657a93..d3c6b0d805 100644
> --- a/src/backend/postmaster/autovacuum.c
> +++ b/src/backend/postmaster/autovacuum.c
> @@ -1834,7 +1834,7 @@ autovac_balance_cost(void)
> * cost_limit to more than the base value.
> */
> worker->wi_cost_limit = Max(Min(limit,
> -                                 worker->wi_cost_limit_base),
> +                                 MAXVACUUMCOSTLIMIT),
> 1);
> }
> We limit the worker to the max cost_limit that was set at the
> beginning of the vacuum.  I introduced the MAXVACUUMCOSTLIMIT constant
> (currently defined to 10000, which is the currently max limit already
> defined) in miscadmin.h so that vacuum will now be able to adjust the
> cost_limit up to 10000 as the upper limit in a currently running vacuum.
>
> The tests that I’ve run show that the performance of an existing
> vacuum can be increased commensurate with the parameter change.
>  Interestingly, autovac_balance_cost(void) is only updating the
> cost_limit, even if the cost_delay is modified.  This is done
> correctly, it was just a surprise to see the behavior.
>
>
> 2021-02-01 13:36:52.346 EST [37891] DEBUG:  VACUUM Sleep: Delay:
> 20.000000, CostBalance: 207, CostLimit: 200, msec: 20.700000
> 2021-02-01 13:36:52.346 EST [37891] CONTEXT:  while scanning block
> 1824 of relation "public.blah"
> 2021-02-01 13:36:52.362 EST [36460] LOG:  received SIGHUP, reloading
> configuration files
> 2021-02-01 13:36:52.364 EST [36460] LOG:  parameter
> "autovacuum_vacuum_cost_delay" changed to "2"
> 2021-02-01 13:36:52.365 EST [36463] DEBUG:  checkpointer updated
> shared memory configuration values
> 2021-02-01 13:36:52.366 EST [36466] DEBUG:
>  autovac_balance_cost(pid=37891 db=13207, rel=16384, dobalance=yes
> cost_limit=2000, cost_limit_base=200, cost_delay=20)
>
> 2021-02-01 13:36:52.366 EST [36467] DEBUG:  received inquiry for
> database 0
> 2021-02-01 13:36:52.366 EST [36467] DEBUG:  writing stats file
> "pg_stat_tmp/global.stat"
> 2021-02-01 13:36:52.366 EST [36467] DEBUG:  writing stats file
> "pg_stat_tmp/db_0.stat"
> 2021-02-01 13:36:52.388 EST [37891] DEBUG:  VACUUM Sleep: Delay:
> 20.000000, CostBalance: 2001, CostLimit: 2000, msec: 20.010000
>
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
