Re: POC: Parallel processing of indexes in autovacuum

From: Daniil Davydov <3danissimo(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: 2026-03-25 07:45:47
Message-ID: CAJDiXgi8X-DMb92v5WHLCNxDHxH9gO8WQxOMtdpmU7X=WXCiuQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

> > Yeah, currently user can misconfigure max_parallel_workers, so (for example)
> > multiple VACUUM PARALLEL operations running at the same time will face with
> > a shortage of parallel workers. But I guess that every system has some sane
> > limit for this parameter's value. If we want to ensure that all a/v leaders
> > are guaranteed to launch as many parallel workers as required, we might need
> > to increase the max_parallel_workers too much (and cross the sane limit).
> > IMHO it may be unacceptable for many systems in production, because it will
> > undermine the stability.
>
> I understand the concern that if max_parallel_workers (and/or
> max_worker_processes) value are not high enough to ensure each
> autovacuum workers can launch autovacuum_max_parallel_workers, an
> autovacuum on the very large table might not be able to launch the
> full workers in case where some parallel workers are already being
> used by others (e.g., another autovacuum on a different
> slightly-smaller table etc.). But I'm not sure that the opt-out style
> can handle these cases. Even if there are two huge tables and users
> set parallel_vacuum_workers to both tables, there is no guarantee that
> autovacuums on these tables can use the full workers, as long as
> max_parallel_workers value is not enough.
>

I guess you mean the "opt-in" style here?

Sure, even the opt-in style doesn't give us an unbreakable guarantee that huge
tables will be processed with the desired number of parallel workers. But IMHO
opt-in greatly increases the probability of this. While looking for arguments in
favor of the opt-in style, I asked for help from a person who has been
managing the setup of high-load systems for decades. He promised to share his
opinion next week.
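
To make the shortage scenario concrete, here is a toy model (Python, not patch
code). It assumes a single shared pool sized by max_parallel_workers and that
leaders are served first-come-first-served; the function name and the numbers
are made up for illustration, and the real limits (max_worker_processes, other
parallel queries) are ignored.

```python
def launch_workers(pool_size, requests):
    """Grant each leader min(requested, free slots), in arrival order.

    Hypothetical helper: a simplified stand-in for a shared worker pool,
    not how the server actually accounts for background workers.
    """
    if pool_size < 0 or any(r < 0 for r in requests):
        raise ValueError("pool size and requests must be non-negative")
    free = pool_size
    granted = []
    for wanted in requests:
        got = min(wanted, free)
        granted.append(got)
        free -= got
    return granted


# Three leaders each asking for 4 workers from a pool of 8:
# the last one gets nothing.
print(launch_workers(8, [4, 4, 4]))
# Only two leaders asking (e.g. only two tables opted in): both are satisfied.
print(launch_workers(8, [4, 4]))
# But with a pool of 6 even two leaders cannot both get the full 4.
print(launch_workers(6, [4, 4]))
```

The second and third calls correspond to the two sides of the argument: fewer
requesters make a shortage less likely, but nothing is guaranteed while the
pool itself is too small.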

> >
> > BTW, do we need to mention that this parameter can be overridden by the
> > per-table setting?
>
> IIUC the per-table setting is not actually overwriting the GUC
> parameter value, but it works as an additional cap. For instance, if
> autovacuum_max_parallel_workers is 2 and autovacuum_parallel_workers
> is 5, we cap the parallel degree by 2, which is a similar behavior to
> other parallel operations such as the parallel_workers storage
> parameter. BTW it actually works in a somewhat different way than
> other autovacuum-related storage parameters; the per-table parameters
> overwrite GUC values. I decided to use the former behavior because
> autovacuum_max_parallel_workers can work as a global switch to disable
> all parallel autovacuum behavior on the system.
>

Yep, you are right, I worded that badly. Let me rephrase my question:
do we need to mention that this parameter can be further capped by the
per-table setting?
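
For reference, the capping semantics quoted above can be sketched like this
(Python, illustration only). The function name is hypothetical, and the sketch
deliberately ignores everything else that affects the final worker count (the
number of indexes, available workers, and so on).

```python
def effective_parallel_degree(guc_max_workers, table_workers):
    """Toy model: the per-table value is an additional cap, not an override.

    guc_max_workers stands for autovacuum_max_parallel_workers,
    table_workers for the autovacuum_parallel_workers storage parameter.
    """
    if guc_max_workers < 0 or table_workers < 0:
        raise ValueError("worker counts must be non-negative")
    return min(guc_max_workers, table_workers)


# Example from the quoted text: GUC = 2, reloption = 5, degree is capped at 2.
print(effective_parallel_degree(2, 5))
# GUC = 0 works as a global switch: no parallel workers for any table.
print(effective_parallel_degree(0, 5))
# A smaller per-table value lowers the degree below the GUC.
print(effective_parallel_degree(4, 1))
```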

>
> > > Part 3 can briefly mention that autovacuum can perform parallel vacuum
> > > with parallel workers capped by autovacuum_max_parallel_workers as
> > > follow:
> > >
> > > For tables with the <xref linkend="reloption-autovacuum-parallel-workers"/>
> > > storage parameter set, an autovacuum worker can perform index vacuuming and
> > > index cleanup with background workers. The number of workers launched by
> > > a single autovacuum worker is limited by the
> > > <xref linkend="guc-autovacuum-max-parallel-workers"/>.
> >
> > I suggest adding here also a description of the method for calculating the
> > number of parallel workers. If so, I feel that this part of documentation will
> > be completely the same as in VACUUM PARALLEL (except a few little details).
> > Maybe we can create some dedicated subchapter in the "Routine vacuuming" where
> > we describe how the number of parallel workers is decided. Lets call it
> > something like "24.1.7 Parallel Vacuuming". Both VACUUM PARALLEL and parallel
> > autovacuum can refer to this subchapter. I think it will be much easier to
> > maintain. What do you think?
>
> Describing the parallel vacuum in a new chapter in section 24.1 sounds
> like a good idea.

OK, then I'll do it.

--
Best regards,
Daniil Davydov
