Re: Should vacuum process config file reload more often

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Should vacuum process config file reload more often
Date: 2023-03-02 07:36:09
Message-ID: CAD21AoB=EC6wzjjqDXKmNEhu_BhGg0g_Gc50yh_zvXqiDC_OSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
> > <melanieplageman(at)gmail(dot)com> wrote:
> > > Users may wish to speed up long-running vacuum of a large table by
> > > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > > config file is only reloaded between tables (for autovacuum) or after
> > > the statement (for explicit vacuum). This has been brought up for
> > > autovacuum in [1].
> > >
> > > Andres suggested that it might be possible to check ConfigReloadPending
> > > in vacuum_delay_point(), so I thought I would draft a rough patch and
> > > start a discussion.
> >
> > In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.
>
> Yes, good point. Thank you!
>
> On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > I don't think I can set and leave vac_in_outer_xact the way I am doing
> > it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
> > which I believe is reachable from codepaths that would not have called
> > vacuum(). It seems that if a backend sets it, the outer transaction
> > commits, and then the backend ends up calling vacuum_delay_point() in a
> > different way later, it wouldn't be quite right.
>
> Perhaps I could just set in_outer_xact to false in the PG_FINALLY()
> section in vacuum() to avoid this problem.
>
> On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
> > > On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > > > > Also, I'm concerned that allowing to change any GUC parameters during
> > > > > vacuum/analyze could be a foot-gun in the future. When modifying
> > > > > vacuum/analyze-related codes, we have to consider the case where any GUC
> > > > > parameters could be changed during vacuum/analyze.
> > > >
> > > > What kind of scenario are you thinking of?
> > >
> > > For example, I guess we will need to take care of changes of
> > > maintenance_work_mem. Currently we initialize the dead tuple space at
> > > the beginning of lazy vacuum, but perhaps we would need to
> > > enlarge/shrink it based on the new value?
> >
> > I don't think we need to do anything about that initially, just because the
> > config can be changed in a more granular way, doesn't mean we have to react to
> > every change for the current operation.
>
> Perhaps we can mention in the docs that a change to maintenance_work_mem
> will not take effect in the middle of vacuuming a table. But, Ithink it probably
> isn't needed.

Agreed.

>
> On another topic, I've just realized that when autovacuuming we only
> update tab->at_vacuum_cost_delay/limit from
> autovacuum_vacuum_cost_delay/limit for each table (in
> table_recheck_autovac()) and then use that to update
> MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
> what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
> So, even if we reload the config file in vacuum_delay_point(), if we
> don't use the new value of autovacuum_vacuum_cost_delay/limit it will
> have no effect for autovacuum.

Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost() that updates that value of active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.

> I started writing a little helper that could be used to update these
> workerinfo->wi_cost_delay/limit in vacuum_delay_point(),

Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.

> but I notice
> when they are first set, we consider the autovacuum table options. So,
> I suppose I would need to consider these when updating
> wi_cost_delay/limit later as well? (during vacuum_delay_point() or
> in AutoVacuumUpdateDelay())
>
> I wasn't quite sure because I found these chained ternaries rather
> difficult to interpret, but I think table_recheck_autovac() is saying
> that the autovacuum table options override all other values for
> vac_cost_delay?
>
> vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
> ? avopts->vacuum_cost_delay
> : (autovacuum_vac_cost_delay >= 0)
> ? autovacuum_vac_cost_delay
> : VacuumCostDelay;
>
> i.e. this?
>
> if (avopts && avopts->vacuum_cost_delay >= 0)
> vac_cost_delay = avopts->vacuum_cost_delay;
> else if (autovacuum_vac_cost_delay >= 0)
> vac_cost_delay = autovacuum_vacuum_cost_delay;
> else
> vac_cost_delay = VacuumCostDelay

Yes, if the table has autovacuum table options, we use these values
and the table is excluded from the balancing algorithm I mentioned
above. See the code from table_recheck_autovac(),

/*
* If any of the cost delay parameters has been set individually for
* this table, disable the balancing algorithm.
*/
tab->at_dobalance =
!(avopts && (avopts->vacuum_cost_limit > 0 ||
avopts->vacuum_cost_delay > 0));

So if the table has autovacuum table options, the vacuum delay
parameters probably should be updated by ALTER TABLE, not by reloading
the config file.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2023-03-02 07:39:14 Re: Generate pg_stat_get_xact*() functions with Macros
Previous Message Michael Paquier 2023-03-02 07:27:39 Re: [BUG] pg_stat_statements and extended query protocol