Re: Should vacuum process config file reload more often

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Should vacuum process config file reload more often
Date: 2023-03-02 01:41:14
Message-ID: CAAKRu_bR2VrTUskR24GkXV15SxPaCiHYD=soOuWZcT-rGDcA8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > Users may wish to speed up long-running vacuum of a large table by
> > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
> > config file is only reloaded between tables (for autovacuum) or after
> > the statement (for explicit vacuum). This has been brought up for
> > autovacuum in [1].
> >
> > Andres suggested that it might be possible to check ConfigReloadPending
> > in vacuum_delay_point(), so I thought I would draft a rough patch and
> > start a discussion.
>
> In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.

Yes, good point. Thank you!

On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> I don't think I can set and leave vac_in_outer_xact the way I am doing
> it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
> which I believe is reachable from codepaths that would not have called
> vacuum(). It seems that if a backend sets it, the outer transaction
> commits, and then the backend ends up calling vacuum_delay_point() in a
> different way later, it wouldn't be quite right.

Perhaps I could just set in_outer_xact to false in the PG_FINALLY()
section in vacuum() to avoid this problem.

On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:
> > On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:
> > > > Also, I'm concerned that allowing to change any GUC parameters during
> > > > vacuum/analyze could be a foot-gun in the future. When modifying
> > > > vacuum/analyze-related codes, we have to consider the case where any GUC
> > > > parameters could be changed during vacuum/analyze.
> > >
> > > What kind of scenario are you thinking of?
> >
> > For example, I guess we will need to take care of changes of
> > maintenance_work_mem. Currently we initialize the dead tuple space at
> > the beginning of lazy vacuum, but perhaps we would need to
> > enlarge/shrink it based on the new value?
>
> I don't think we need to do anything about that initially, just because the
> config can be changed in a more granular way, doesn't mean we have to react to
> every change for the current operation.

Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But, Ithink it probably
isn't needed.

On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.

I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(), but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())

I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?

vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
? avopts->vacuum_cost_delay
: (autovacuum_vac_cost_delay >= 0)
? autovacuum_vac_cost_delay
: VacuumCostDelay;

i.e. this?

if (avopts && avopts->vacuum_cost_delay >= 0)
vac_cost_delay = avopts->vacuum_cost_delay;
else if (autovacuum_vac_cost_delay >= 0)
vac_cost_delay = autovacuum_vacuum_cost_delay;
else
vac_cost_delay = VacuumCostDelay

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-03-02 01:57:09 Re: Allow logical replication to copy tables in binary format
Previous Message Tom Lane 2023-03-02 01:40:18 Re: typedef struct LogicalDecodingContext