Re: Should vacuum process config file reload more often

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Should vacuum process config file reload more often
Date: 2023-03-07 05:09:54
Message-ID: CAD21AoAB-jtKvYd-QbnhbanoY+-PDCXvrX-JhBUqGFrB6e7eTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> >
> > On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
> > > <melanieplageman(at)gmail(dot)com> wrote:
> > > > On another topic, I've just realized that when autovacuuming we only
> > > > update tab->at_vacuum_cost_delay/limit from
> > > > autovacuum_vacuum_cost_delay/limit for each table (in
> > > > table_recheck_autovac()) and then use that to update
> > > > MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
> > > > what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
> > > > So, even if we reload the config file in vacuum_delay_point(), if we
> > > > don't use the new value of autovacuum_vacuum_cost_delay/limit it will
> > > > have no effect for autovacuum.
> > >
> > > Right, but IIUC wi_cost_limit (and VacuumCostDelayLimit) might be
> > > updated. After the autovacuum launcher reloads the config file, it
> > > calls autovac_balance_cost() that updates that value of active
> > > workers. I'm not sure why we don't update workers' wi_cost_delay,
> > > though.
> >
> > Ah yes, I didn't realize this. Thanks. I went back and did more code
> > reading/analysis, and I see no reason why we shouldn't update
> > worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
> > autovac_balance_cost(). Then, as you said, the autovac launcher will
> > call autovac_balance_cost() when it reloads the configuration file.
> > Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
> > will update VacuumCostDelay.
> >
> > > > I started writing a little helper that could be used to update these
> > > > workerinfo->wi_cost_delay/limit in vacuum_delay_point(),
> > >
> > > Since we set vacuum delay parameters for autovacuum workers so that we
> > > ration out I/O equally, I think we should keep the current mechanism
> > > that the autovacuum launcher sets workers' delay parameters and they
> > > update accordingly.
> >
> > Yes, agreed, it should go in the same place as where we update
> > wi_cost_limit (autovac_balance_cost()). I think we should potentially
> > rename autovac_balance_cost() because its name and all its comments
> > point to its only purpose being to balance the total of the workers
> > wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
> > autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.
> >
> > Though, since this change on its own would make autovacuum pick up new
> > values of autovacuum_vacuum_cost_limit (without having the worker reload
> > the config file), I wonder if it makes sense to try and have
> > vacuum_delay_point() only reload the config file if it is an explicit
> > vacuum or an analyze not being run in an outer transaction (to avoid
> > overhead of reloading config file)?
> >
> > The lifecycle of this different vacuum delay-related gucs and how it
> > differs between autovacuum workers and explicit vacuum is quite tangled
> > already, though.
>
> So, I've attached a new version of the patch which is quite different
> from the previous versions.

Thank you for updating the patch!

>
> In this version I've removed wi_cost_delay from WorkerInfoData. There is
> no synchronization of cost_delay amongst workers, so there is no reason
> to keep it in shared memory.
>
> One consequence of not updating VacuumCostDelay from wi_cost_delay is
> that we have to have a way to keep track of whether or not autovacuum
> table options are in use.
>
> This patch does this in a cringeworthy way. I added two global
> variables, one to track whether or not cost delay table options are in
> use and the other to store the value of the table option cost delay. I
> didn't want to use a single variable with a special value to indicate
> that table option cost delay is in use because
> autovacuum_vacuum_cost_delay already has special values that mean
> certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker but it might be worth considering to keep
wi_cost_delay for simplicity.

---
void
AutoVacuumUpdateDelay(void)
{
- if (MyWorkerInfo)
+ /*
+ * We are using autovacuum-related GUCs to update
VacuumCostDelay, so we
+ * only want autovacuum workers and autovacuum launcher to do this.
+ */
+ if (!(am_autovacuum_worker || am_autovacuum_launcher))
+ return;

Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?

---
In at autovac_balance_cost(), we have,

int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
autovacuum_vac_cost_limit : VacuumCostLimit);
double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);
:
/* not set? nothing to do */
if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
return;

IIUC if autovacuum_vac_cost_delay is changed to 0 during autovacuums
running, their vacuum delay parameters are not changed. It's not a bug
of the patch but I think we can fix it in this patch.

>
> It is worth mentioning that I think that in master,
> AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
> wi_cost_delay from shared memory without holding a lock.

Indeed.

> I've added in a shared lock for reading from wi_cost_limit in this
> patch. However, AutoVacuumUpdateLimit() is called unconditionally in
> vacuum_delay_point(), which is called quite often (per block-ish), so I
> was trying to think if there is a way we could avoid having to check
> this shared memory variable on every call to vacuum_delay_point().
> Rebalances shouldn't happen very often (done by the launcher when a new
> worker is launched and by workers between vacuuming tables). Maybe we
> can read from it less frequently?

Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.

>
> Also not sure how the patch interacts with failsafe autovac and parallel
> vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

As far as I can see we don't need special treatments for parallel
vacuum cases since it works only in manual vacuum. It calculates the
sleep time based on the shared cost balance and how much the worker
did I/O but the basic mechanism is the same as non-parallel case.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-03-07 05:15:27 Re: NumericShort vs NumericLong format
Previous Message Michael Paquier 2023-03-07 05:00:50 Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?