Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers
Date: 2020-09-21 03:48:24
Message-ID: CAA4eK1+gD7jAP4wqx8+wNhqpc8cM_7o2WvVBa0OVXLsgoDFHqA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Sep 20, 2020 at 7:15 PM Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>
> On 2020-09-19 13:24, Amit Kapila wrote:
> >> I think the implemented behavior is wrong.
> >
> > It is the same as what we do for other parallel operations, for
> > example, we limit the number of parallel workers for parallel create
> > index by 'max_parallel_maintenance_workers' and parallel scan
> > operations are limited by 'max_parallel_workers_per_gather'.
>
> But in those cases we don't provide user-visible options to specify a
> per-command setting, so it's not the same thing, is it?
>

Not exactly, but there too the user has a way to set the value (the
'parallel_workers' storage parameter in CREATE TABLE or ALTER TABLE),
which guides the parallel scans.

> >> The VACUUM PARALLEL option
> >> should override the max_parallel_maintenance_workers setting.
> >>
> >> Otherwise, what's the point of the command option?
> >
> > It is for the cases where the user has a better idea of workload. We
> > can launch only a limited number of parallel workers
> > 'max_parallel_workers' in the system, so sometimes users would like to
> > use it as per their requirement.
>
> Right, but my point is, it doesn't actually do that correctly. I can't
> just say, oh, I have a maintenance window, I'd like to run a really fast
> VACUUM. The PARALLEL option is capped by the setting you'd normally use
> anyway, so specifying it is useless.
>

Yeah, because by default we choose the maximum possible number of
workers for VACUUM.
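
To make the capping concrete, here is a rough model in Python. It is
only a sketch, not the PostgreSQL source: the function name and its
parameters are hypothetical, "eligible_indexes" stands for the number
of indexes that can be vacuumed in parallel, and it ignores details
such as the leader taking part in the work and the overall
'max_parallel_workers' limit.

```python
from typing import Optional


def vacuum_parallel_workers(requested: Optional[int],
                            eligible_indexes: int,
                            max_parallel_maintenance_workers: int) -> int:
    """Simplified model of how many workers a parallel VACUUM may use.

    requested is the value of the PARALLEL option, or None when the
    option is not given. All names here are illustrative assumptions.
    """
    if eligible_indexes <= 0 or max_parallel_maintenance_workers <= 0:
        return 0
    if requested is None:
        # Default: ask for as many workers as there are eligible indexes.
        wanted = eligible_indexes
    else:
        # An explicit request can never exceed the number of indexes.
        wanted = min(requested, eligible_indexes)
    # Either way, the GUC is the upper bound.
    return min(wanted, max_parallel_maintenance_workers)


# With 8 eligible indexes and the GUC set to 2:
default_case = vacuum_parallel_workers(None, 8, 2)  # option omitted
ask_for_more = vacuum_parallel_workers(8, 8, 2)     # PARALLEL 8
ask_for_less = vacuum_parallel_workers(1, 8, 2)     # PARALLEL 1
```

Under this model, asking for more workers than the setting allows
gives the same result as omitting the option, and only a smaller
request changes anything.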

> The only thing it can do right now is if you want to run a manual VACUUM
> less parallel than by default. But I don't see how that is often useful.
>

Say the indexes that support parallel scan are not very big; then we
don't need the default behavior, because it would use more resources
without providing much additional benefit.

What, in your view, should the behavior be here, and how would it be
better than the current one?

--
With Regards,
Amit Kapila.
