Re: Should vacuum process config file reload more often

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Should vacuum process config file reload more often
Date: 2023-04-07 13:23:13
Message-ID: 23D2F3D0-90C2-428A-817F-1E16194DDC2D@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 7 Apr 2023, at 15:07, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:

>> + /* Only log updates to cost-related variables */
>> + if (vacuum_cost_delay == original_cost_delay &&
>> + vacuum_cost_limit == original_cost_limit)
>> + return;
>>
>> IIUC by default, we log not only before starting the vacuum but also
>> when changing cost-related variables. Which is good, I think, because
>> logging the initial values would also be helpful for investigation.
>> However, I think that we don't log the initial vacuum cost values
>> depending on the values. For example, if the
>> autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
>> the initial values. I think that instead of comparing old and new
>> values, we can write the log only if
>> message_level_is_interesting(DEBUG2) is true. That way, we don't need
>> to acquire the lwlock unnecessarily. And the code looks cleaner to me.
>> I've attached the patch (use_message_level_is_interesting.patch)
>
> Thanks for coming up with the case you thought of with storage param for
> cost delay = 0. In that case we wouldn't print the message initially and
> we should fix that.
>
> I disagree, however, that we should condition it only on
> message_level_is_interesting().

I think we should keep the logging frequency as committed, but condition taking
the lock on message_level_is_interesting().

> Actually, outside of printing initial values when the autovacuum worker
> first starts (before vacuuming all tables), I don't think we should log
> these values except when they are being updated. Autovacuum workers
> could vacuum tons of small tables and having this print out at least
> once per table (which I know is how it is on master) would be
> distracting. Also, you could be reloading the config to update some
> other GUCs and be oblivious to an ongoing autovacuum and get these
> messages printed out, which I would also find distracting.
>
> You will have to stare very hard at the logs to tell if your changes to
> vacuum cost delay and limit took effect when you reload config. I think
> with our changes to update the values more often, we should take the
> opportunity to make this logging more useful by making it happen only
> when the values are changed.
>
> I would be open to elevating the log level to DEBUG1 for logging only
> updates and, perhaps, having an option if you set log level to DEBUG2,
> for example, to always log these values in VacuumUpdateCosts().
>
> I'd even argue that, potentially, having the cost-delay related
> parameters printed at the beginning of vacuuming could be interesting to
> regular VACUUM as well (even though it doesn't benefit from config
> reload while in progress).
>
> To fix the issue you mentioned and ensure the logging is printed when
> autovacuum workers start up before vacuuming tables, we could either
> initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
> that will always be different than what they are set to in
> VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
> these values since they are set to the defaults for VACUUM). Or, we
> could duplicate this logging message in do_autovacuum().

Duplicating logging, maybe with a slightly tailored message, seem the least
bad option.

> Finally, one other point about message_level_is_interesting(). I liked
> the idea of using it a lot, since log level DEBUG2 will not be the
> common case. I thought of it but hesitated because all other users of
> message_level_is_interesting() are avoiding some memory allocation or
> string copying -- not avoiding take a lock. Making this conditioned on
> log level made me a bit uncomfortable. I can't think of a situation when
> it would be a problem, but it felt a bit off.

Considering how uncommon DEBUG2 will be in production, I think conditioning
taking a lock on it makes sense.

>> Also, while testing the autovacuum delay with relopt
>> autovacuum_vacuum_cost_delay = 0, I realized that even if we set
>> autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to
>> true. wi_dobalance comes from the following expression:
>>
>> /*
>> * If any of the cost delay parameters has been set individually for
>> * this table, disable the balancing algorithm.
>> */
>> tab->at_dobalance =
>> !(avopts && (avopts->vacuum_cost_limit > 0 ||
>> avopts->vacuum_cost_delay > 0));
>>
>> The initial values of both avopts->vacuum_cost_limit and
>> avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
>> of "> 0". Otherwise, we include the autovacuum worker working on a
>> table whose autovacuum_vacuum_cost_delay is 0 to the balancing
>> algorithm. Probably this behavior has existed also on back branches
>> but I haven't checked it yet.
>
> Thank you for catching this. Indeed this exists in master since
> 1021bd6a89b which was backpatched. I checked and it is true all the way
> back through REL_11_STABLE.
>
> Definitely seems worth fixing as it kind of defeats the purpose of the
> original commit. I wish I had noticed before!
>
> Your fix has:
> !(avopts && (avopts->vacuum_cost_limit >= 0 ||
> avopts->vacuum_cost_delay >= 0));
>
> And though delay is required to be >= 0
> avopts->vacuum_cost_delay >= 0
>
> Limit does not. It can just be > 0.
>
> postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
> ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
> DETAIL: Valid values are between "1" and "10000".
>
> Though >= is also fine, the rest of the code in all versions always
> checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
> and not for limit. Probably best we keep it consistent (though the whole
> thing is quite confusing).

+1

--
Daniel Gustafsson

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2023-04-07 13:28:43 Re: CREATE SUBSCRIPTION -- add missing tab-completes
Previous Message Melanie Plageman 2023-04-07 13:07:46 Re: Should vacuum process config file reload more often