Re: Should vacuum process config file reload more often

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Should vacuum process config file reload more often
Date: 2023-04-06 15:52:25
Message-ID: CAAKRu_aH-UbvNCQq-EouNUTjtGexQTRVo7ik_vYR4WUocVny7w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 5, 2023 at 11:10 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> On Wed, Apr 5, 2023 at 3:43 PM Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> > On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > >
> > > + /*
> > > + * Balance and update limit values for autovacuum workers. We must
> > > + * always do this in case the autovacuum launcher or another
> > > + * autovacuum worker has recalculated the number of workers across
> > > + * which we must balance the limit. This is done by the launcher when
> > > + * launching a new worker and by workers before vacuuming each table.
> > > + */
> > >
> > > I don't quite understand what's going on here. A big reason that I'm
> > > worried about this whole issue in the first place is that sometimes
> > > there's a vacuum going on a giant table and you can't get it to go
> > > fast. You want it to absorb new settings, and to do so quickly. I
> > > realize that this is about the number of workers, not the actual cost
> > > limit, so that makes what I'm about to say less important. But ... is
> > > this often enough? Like, the time before we move onto the next table
> > > could be super long. The time before a new worker is launched should
> > > be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
> > > settings, so that's not horrible, but I'm kind of struggling to
> > > understand the rationale for this particular choice. Maybe it's fine.
> >
> > VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
> > happen if a config reload is pending the next time vacuum_delay_point()
> > is called (which is pretty often -- roughly once per block vacuumed but
> > definitely more than once per table).
> >
> > Relevant code is at the top of vacuum_delay_point():
> >
> >     if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
> >     {
> >         ConfigReloadPending = false;
> >         ProcessConfigFile(PGC_SIGHUP);
> >         VacuumUpdateCosts();
> >     }
> >
>
> Gah, I think I misunderstood you. You are saying that only calling
> AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
> not be enough. The frequency at which the number of workers changes will
> likely be different from how often we nap. This is a good point.
> It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...

A not fully baked idea for a solution:

Why not keep the balanced limit itself in the atomic, instead of the
number of workers to balance across? If we expect all of the workers to
have the same value for the cost limit, then why only count the workers
rather than also do the division and store the result in the atomic
variable? We are worried about the division not being done often
enough, not about the number of workers being out of date. This solves
that, right?
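
Roughly what I'm picturing (a rough, untested sketch -- all of the
names below are made up, not from the patch):

    /*
     * In AutoVacuumShmemStruct: store the already-divided per-worker
     * limit, not the count of workers to divide by.
     */
    pg_atomic_uint32 av_balancedCostLimit;

    /*
     * Writer side: called wherever we currently recount the workers,
     * i.e. by the launcher when launching a worker and by workers
     * before vacuuming each table.
     */
    static void
    AutoVacUpdateBalancedCostLimit(int nworkers_for_balance)
    {
        /* same fallback as today: autovacuum's limit, else vacuum's */
        int     limit = autovacuum_vac_cost_limit > 0 ?
            autovacuum_vac_cost_limit : VacuumCostLimit;

        pg_atomic_write_u32(&AutoVacuumShmem->av_balancedCostLimit,
                            (uint32) Max(limit / Max(nworkers_for_balance, 1), 1));
    }

    /*
     * Reader side: AutoVacuumUpdateCostLimit() just loads the
     * ready-made value, so a worker picking up a pending reload in
     * vacuum_delay_point() never works from a stale division.
     */
    vacuum_cost_limit = (int)
        pg_atomic_read_u32(&AutoVacuumShmem->av_balancedCostLimit);

That way the division is always at least as fresh as the worker count,
and readers can never lag behind it.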

- Melanie
