| From: | Daniil Davydov <3danissimo(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: POC: Parallel processing of indexes in autovacuum |
| Date: | 2026-03-19 14:28:57 |
| Message-ID: | CAJDiXgh3Dg2f5k3xRJnzoY39jQENUhh125ArYapXkSu5D7JJuw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Thu, Mar 19, 2026 at 2:49 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Yes, we already have such a code for PARALLEL option for the VACUUM command.
>
> I guess it's better that autovacuum codes also somewhat follow this
> code for better consistency.
>
I agree. You can find it in the v29-0002 patch.
> > I'm afraid that I can't agree with you here. As I wrote above [1], the
> > parallel a/v feature will be useful when a user has a few huge tables with
> > a big amount of indexes. Only these tables require parallel processing and a
> > user knows about it.
>
> Isn't it a case where users need to increase
> min_parallel_index_scan_size? Suppose that there are two tables that
> are big enough and have enough indexes, it's more natural to me to use
> parallel vacuum for both tables without user manual settings.
>
Do you mean that the user can increase this parameter so that smaller tables
are not considered for parallel a/v? If so, I don't think it will always be
practical. When I say "smaller tables" I mean small relative to the super
huge tables. These "smaller tables" can still be pretty big and require
parallel index scans in parallel queries or in a manual VACUUM (PARALLEL)
(not autovacuum). Increasing min_parallel_index_scan_size can hurt the
performance of queries that rely on the ability to scan indexes of such
tables in parallel. A separate parameter such as
"autovacuum_min_parallel_index_scan_size" could help here, but I don't think
we want to introduce many new GUC parameters for a single feature.
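To make the trade-off concrete, here is a hypothetical SQL sketch (the
autovacuum_parallel_workers storage parameter is the one proposed in this
patch set, not yet in core; the table name is made up):

```sql
-- Per-table opt-in for parallel autovacuum, as proposed in the patch set:
ALTER TABLE huge_events SET (autovacuum_parallel_workers = 4);

-- The cluster-wide alternative also affects parallel index scans in
-- regular queries, which may be undesirable:
-- ALTER SYSTEM SET min_parallel_index_scan_size = '512MB';
```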
> > If we implement the feature as you suggested, then after setting the
> > av_max_parallel_workers to N > 0, the user will have to manually disable
> > processing for all tables except the largest ones. This will need to be done
> > to ensure that parallel workers are launched specifically to process the
> > largest tables and not wasting on the processing of little ones.
> >
> > I.e. I'm proposing a design that will require manual actions to *enable*
> > parallel a/v for several large tables rather than *disable* it for all of
> > the rest tables in the cluster. I'm sure that's what users want.
> >
> > Allowing the system to decide which tables to process in parallel is a good
> > way from a design perspective. But I'm thinking of the following example :
> > Imagine that we have a threshold, when exceeded, parallel a/v is used.
> > Several a/v workers encounter tables which exceed this threshold by 1_000 and
> > each of these workers decides to launch a few parallel workers. Another a/v
> > worker encounters a table which is beyond this threshold by 1_000_000 and
> > tries to launch N parallel workers, but facing the max_parallel_workers
> > shortage. Thus, processing of this table will take a very long time to
> > complete due to lack of resources. The only way for users to avoid it is to
> > disable parallel a/v for all tables, which exceeds the threshold and are not
> > of particular interest.
>
> I think the same thing happens even with the current design as long as
> users misconfigure max_parallel_workers, no? Setting
> autovacuum_max_parallel_workers to >0 would mean that users want to
> give additional resources for autovacuums in general, I think it makes
> sense to use parallel vacuum even for tables which exceed the
> threshold by 1000.
>
> Users who want to use parallel autovacuum would have to set
> max_parallel_workers (and max_worker_processes) high enough so that
> each autovacuum worker can use parallel workers. If resource
> contention occurs, it's a sign that the limits are not configured
> properly.
>
Yeah, currently a user can misconfigure max_parallel_workers, so (for
example) multiple VACUUM (PARALLEL) operations running at the same time will
face a shortage of parallel workers. But I guess every system has some sane
limit for this parameter's value. If we want to ensure that all a/v leaders
are guaranteed to launch as many parallel workers as required, we might need
to increase max_parallel_workers too much (and cross that sane limit). IMHO
that may be unacceptable for many production systems, because it would
undermine stability.
I don't have direct evidence for this, so I'll try to get the opinion of
people who would use the parallel a/v feature in big production systems.
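For illustration, a sketch of the kind of limits involved
(autovacuum_max_parallel_workers is the GUC proposed in this patch set; the
numbers are arbitrary):

```sql
-- Parallel workers for autovacuum come out of the same global pool:
ALTER SYSTEM SET max_worker_processes = 16;           -- overall worker cap
ALTER SYSTEM SET max_parallel_workers = 8;            -- parallel subset
ALTER SYSTEM SET autovacuum_max_parallel_workers = 4; -- proposed GUC
```

Guaranteeing every a/v leader its full parallel degree would mean raising
the first two values well beyond what many installations consider safe.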
> > I'm not sure if this phrase will be understandable to the user.
> > I don't see any places where we would define the "autovacuum operation"
> > concept, so I suppose it could be ambiguous. What about "Maximum number of
> > parallel processes per autovacuuming of one table"?
>
> "autovacuuming of one table" sounds unnatural to me. How about
> "Maximum number of parallel workers that can be used by a single
> autovacuum worker."?
>
It sounds good, I agree.
> >
> > > We check only the server logs throughout the new tap tests. I think we
> > > should also confirm that the autovacuum successfully completes. I've
> > > attached the proposed change to the tap tests.
> > >
> >
> > I agree with proposed changes. BTW, don't we need to reduce the strings
> > length to 80 characters in the tests? In some tests, this rule is followed,
> > and in some it is not.
>
> Yeah, pgperltidy should be run for new tests.
>
OK. I'll do it.
> The 0001 patch looks good to me. I've updated the commit message and
> attached it. I'm going to push the patch, barring any objections.
>
Great news!
> Regarding the documentation changes, I find that the current patch
> needs more explanation at appropriate sections. I think we need to:
>
> 1. describe the new autovacuum_max_parallel_workers GUC parameter (in
> config.sgml)
> 2. describe the new autovacuum_parallel_workers storage parameter (in
> create_table.sgml)
> 3. mention that autovacuum could use parallel vacuum (in maintenance.sgml).
>
I agree.
> I think that part 1 should include the basic explanation of the GUC
> parameter as well as how the number of workers is decided (which could
> be similar to the description for PARALLEL options of the VACUUM
> command).
IMHO, the description of the method for determining the number of parallel
workers would fit better in part 3.
BTW, do we need to mention that this parameter can be overridden by the
per-table setting?
> Part 2 can explain the storage parameter as follow:
>
> Per-table value for <xref linkend="guc-autovacuum-max-parallel-workers"/>
> parameter. If -1 is specified,
> <varname>autovacuum_max_parallel_workers</varname>
> value will be used. The default value is 0.
>
It looks very compact and beautiful, I agree.
Actually, if -1 is specified then we are "choosing the parallel degree based
on the number of indexes". We have several places in the code with such
phrasing. I don't really like it because 1) even if the value != -1 we still
take the number of indexes into account, and 2) basically it is the same as
saying "limited by the GUC parameter". I don't want to touch the existing
comments in vacuumparallel.c, but in our patch I'd like to say that "the GUC
parameter's value will be used". I hope this will not cause any
misunderstanding among readers.
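A hypothetical sketch of the semantics being discussed (the table name is
made up; the storage parameter is the one proposed in the patch set):

```sql
-- 0 (default): parallel autovacuum disabled for this table
-- -1: the autovacuum_max_parallel_workers GUC value will be used
-- N > 0: explicit per-table cap
ALTER TABLE huge_events SET (autovacuum_parallel_workers = -1);
```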
> Part 3 can briefly mention that autovacuum can perform parallel vacuum
> with parallel workers capped by autovacuum_max_parallel_workers as
> follow:
>
> For tables with the <xref linkend="reloption-autovacuum-parallel-workers"/>
> storage parameter set, an autovacuum worker can perform index vacuuming and
> index cleanup with background workers. The number of workers launched by
> a single autovacuum worker is limited by the
> <xref linkend="guc-autovacuum-max-parallel-workers"/>.
I suggest also adding a description of the method for calculating the number
of parallel workers. But then, I feel this part of the documentation would be
almost identical to the one for VACUUM (PARALLEL) (except for a few small
details). Maybe we can create a dedicated subchapter in "Routine Vacuuming"
where we describe how the number of parallel workers is decided. Let's call
it something like "24.1.7 Parallel Vacuuming". Both VACUUM (PARALLEL) and
parallel autovacuum can refer to this subchapter. I think it will be much
easier to maintain. What do you think?
--
Thank you very much for the comments and the prepared patch!
Please see the updated set of patches (I didn't touch patches 0001, 0003 and
0005).
The 0001 patch contains a pretty controversial fix for the
"autovacuum_parallel_workers" description, but I didn't come up with anything
better.
--
Best regards,
Daniil Davydov
| Attachment | Content-Type | Size |
|---|---|---|
| v30-0003-Cost-based-parameters-propagation-for-parallel-a.patch | text/x-patch | 11.0 KB |
| v30-0002-Parallel-autovacuum.patch | text/x-patch | 10.4 KB |
| v30-0005-Documentation-for-parallel-autovacuum.patch | text/x-patch | 4.5 KB |
| v30-0001-Add-parallel-vacuum-worker-usage-to-VACUUM-VERBO.patch | text/x-patch | 9.5 KB |
| v30-0004-Tests-for-parallel-autovacuum.patch | text/x-patch | 11.4 KB |
| v29--v30-diff-for-0004.patch | text/x-patch | 5.7 KB |
| v29--v30-diff-for-0002.patch | text/x-patch | 3.0 KB |