| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Daniil Davydov <3danissimo(at)gmail(dot)com> |
| Cc: | Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: POC: Parallel processing of indexes in autovacuum |
| Date: | 2026-03-18 19:49:17 |
| Message-ID: | CAD21AoDxhN8Z6Lx1ZicBXKkbMsRQqEXiq4ALs4uaD648iSvXoA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Wed, Mar 18, 2026 at 2:23 AM Daniil Davydov <3danissimo(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Tue, Mar 17, 2026 at 11:51 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > I find the current behavior of the autovacuum_parallel_workers storage
> > parameter somewhat unintuitive for users. The documentation currently
> > states:
> >
> > + <para>
> > + Sets the maximum number of parallel autovacuum workers that can process
> > + indexes of this table.
> > + The default value is -1, which means no parallel index vacuuming for
> > + this table. If value is 0 then parallel degree will computed based on
> > + number of indexes.
> > + Note that the computed number of workers may not actually be available at
> > + run time. If this occurs, the autovacuum will run with fewer workers
> > + than expected.
> > + </para>
> >
> > It is quite confusing that setting the value to 0 does not actually
> > disable the parallel vacuum. In many other PostgreSQL parameters, 0
> > typically means "off" or "no workers." I think that this parameter
> > should behave as follows:
> >
> > -1: Use the value of autovacuum_max_parallel_workers (GUC) as the
> > limit (fallback).
> > >=0: Use the specified value as the limit, capped by autovacuum_max_parallel_workers. (Specifically, setting this to 0 would disable parallel vacuum for the table).
> >
>
> Actually we have several places in the code where "-1" means disabled and "0"
> means choosing a parallel degree based on the number of indexes. Since this
> is an inner logic, I agree that we should make our parameter more intuitive
> to the user. But this will make the code a bit confusing.

Yes, we already have such code for the PARALLEL option of the VACUUM command:

    /*
     * Disable parallel vacuum, if user has specified parallel degree
     * as zero.
     */
    if (nworkers == 0)
        params.nworkers = -1;
    else
        params.nworkers = nworkers;

I guess it's better for the autovacuum code to follow the same
pattern, for consistency.

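As a rough illustration of the semantics under discussion, the mapping
from the user-facing storage parameter to the internal nworkers
convention could look something like the sketch below. This is not
code from the patch: the function name av_resolve_parallel_workers and
both of its arguments are hypothetical. It assumes the proposed
meaning of the parameters (-1 falls back to the GUC, 0 disables, N > 0
is capped by the GUC) and returns -1 for "disabled", as the VACUUM
code above does.

```c
/*
 * Hypothetical helper, for illustration only (not part of the patch).
 *
 * reloption: assumed value of the autovacuum_parallel_workers storage
 *            parameter; -1 means "not set, fall back to the GUC".
 * guc_limit: assumed value of autovacuum_max_parallel_workers (>= 0).
 *
 * Returns the internal nworkers value, where -1 means that parallel
 * vacuum is disabled and N > 0 is the maximum number of workers.
 */
int
av_resolve_parallel_workers(int reloption, int guc_limit)
{
    int limit;

    if (reloption < 0)
        limit = guc_limit;      /* fall back to the GUC */
    else if (reloption < guc_limit)
        limit = reloption;      /* per-table value is the tighter limit */
    else
        limit = guc_limit;      /* per-table value is capped by the GUC */

    /* Zero means "off" for the user; internally that is -1. */
    return (limit == 0) ? -1 : limit;
}
```

With this shape, a table with the storage parameter set to 0 never
uses parallel workers regardless of the GUC, and leaving the GUC at 0
disables the feature for every table.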
>
> > Currently, the patch implements parallel autovacuum as an "opt-in"
> > style. That is, even after setting the GUC to >0, users must manually
> > set the storage parameter for each table. This assumes that users
> > already know exactly which tables need parallel vacuum.
> >
> > However, I believe it would be more intuitive to let the system decide
> > which tables are eligible for parallel vacuum based on index size and
> > count (via min_parallel_index_scan_size, etc.), rather than forcing
> > manual per-table configuration. Therefore, I'm thinking we might want
> > to make it "opt-out" style by default instead:
> >
> > - Set the default value of the storage parameter to -1 (i.e., fallback to GUC).
> > - the default value of the GUC autovacuum_max_parallel_workers at 0.
> >
> > With this configuration:
> >
> > - Parallel autovacuum is disabled by default.
> > - Users can enable it globally by simply setting the GUC to >0.
> > - Users can still disable it for specific tables by setting the
> > storage parameter to 0.
> >
> > What do you think?
>
> I'm afraid that I can't agree with you here. As I wrote above [1], the
> parallel a/v feature will be useful when a user has a few huge tables with
> a big amount of indexes. Only these tables require parallel processing and a
> user knows about it.

Isn't that a case where users should increase
min_parallel_index_scan_size? If two tables are both big enough and
have enough indexes, it seems more natural to me to use parallel
vacuum for both of them without any manual per-table settings.

> If we implement the feature as you suggested, then after setting the
> av_max_parallel_workers to N > 0, the user will have to manually disable
> processing for all tables except the largest ones. This will need to be done
> to ensure that parallel workers are launched specifically to process the
> largest tables and not wasting on the processing of little ones.
>
> I.e. I'm proposing a design that will require manual actions to *enable*
> parallel a/v for several large tables rather than *disable* it for all of
> the rest tables in the cluster. I'm sure that's what users want.
>
> Allowing the system to decide which tables to process in parallel is a good
> way from a design perspective. But I'm thinking of the following example :
> Imagine that we have a threshold, when exceeded, parallel a/v is used.
> Several a/v workers encounter tables which exceed this threshold by 1_000 and
> each of these workers decides to launch a few parallel workers. Another a/v
> worker encounters a table which is beyond this threshold by 1_000_000 and
> tries to launch N parallel workers, but facing the max_parallel_workers
> shortage. Thus, processing of this table will take a very long time to
> complete due to lack of resources. The only way for users to avoid it is to
> disable parallel a/v for all tables, which exceeds the threshold and are not
> of particular interest.

I think the same thing happens even with the current design as long as
users misconfigure max_parallel_workers, no? Setting
autovacuum_max_parallel_workers to >0 means that users want to give
additional resources to autovacuum in general, so I think it makes
sense to use parallel vacuum even for tables that exceed the
threshold by only 1000.

Users who want to use parallel autovacuum have to set
max_parallel_workers (and max_worker_processes) high enough that each
autovacuum worker can use parallel workers. If resource contention
occurs, it's a sign that the limits are not configured properly.

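To make the threshold discussion a bit more concrete, here is a
simplified sketch of how a parallel degree can be derived from index
sizes. It is loosely modeled on how the PARALLEL option of VACUUM
picks workers (only indexes of at least min_parallel_index_scan_size
count, and the leader processes one index itself), but the function
name, the plain arrays, and the page-based sizes are assumptions made
for illustration, not the patch's code.

```c
#include <stddef.h>

/*
 * Hypothetical sketch, for illustration only.
 *
 * index_pages:     size of each index, in pages
 * nindexes:        number of entries in index_pages
 * min_index_pages: stand-in for min_parallel_index_scan_size
 * worker_limit:    stand-in for the effective per-table limit (>= 0)
 *
 * Returns the number of parallel workers to request; 0 means the
 * vacuum runs without parallel workers.
 */
int
sketch_compute_parallel_workers(const long *index_pages, size_t nindexes,
                                long min_index_pages, int worker_limit)
{
    int eligible = 0;
    int workers;

    /* Only sufficiently large indexes are worth a parallel worker. */
    for (size_t i = 0; i < nindexes; i++)
    {
        if (index_pages[i] >= min_index_pages)
            eligible++;
    }

    /* The leader takes one index itself, so it needs one fewer worker. */
    workers = eligible - 1;
    if (workers <= 0)
        return 0;

    /* Never request more than the configured limit. */
    if (workers > worker_limit)
        workers = worker_limit;

    return workers;
}
```

In this model, a table that barely exceeds the threshold with two
indexes asks for a single worker, so the worker pool is mostly
consumed by tables with many large indexes.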
> >
> > +{ name => 'autovacuum_max_parallel_workers', type => 'int', context
> > => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> > + short_desc => 'Maximum number of parallel workers that a single
> > autovacuum worker can take from bgworkers pool.',
> > + variable => 'autovacuum_max_parallel_workers',
> > + boot_val => '2',
> > + min => '0',
> > + max => 'MAX_BACKENDS',
> > +},
> >
> > How about rephrasing the short description to "Maximum number of
> > parallel processes per autovacuum operation."?
>
> I'm not sure if this phrase will be understandable to the user.
> I don't see any places where we would define the "autovacuum operation"
> concept, so I suppose it could be ambiguous. What about "Maximum number of
> parallel processes per autovacuuming of one table"?
"autovacuuming of one table" sounds unnatural to me. How about
"Maximum number of parallel workers that can be used by a single
autovacuum worker."?
>
> > We check only the server logs throughout the new tap tests. I think we
> > should also confirm that the autovacuum successfully completes. I've
> > attached the proposed change to the tap tests.
> >
>
> I agree with proposed changes. BTW, don't we need to reduce the strings
> length to 80 characters in the tests? In some tests, this rule is followed,
> and in some it is not.

Yeah, pgperltidy should be run on the new tests.

> Thank you very much for the review and proposed patches!
> Please, see an updated set of patches. Note that the "logging for autovacuum"
> is considered as the first patch now.

Thank you for updating the patches!

The 0001 patch looks good to me. I've updated the commit message and
attached it. I'm going to push the patch, barring any objections.

While we need more discussion on the above points (opt-in vs.
opt-out), I think that the rest of the patches are getting close.

Regarding the documentation changes, I find that the current patch
needs more explanation in the appropriate sections. I think we need to:

1. describe the new autovacuum_max_parallel_workers GUC parameter (in
config.sgml)
2. describe the new autovacuum_parallel_workers storage parameter (in
create_table.sgml)
3. mention that autovacuum could use parallel vacuum (in maintenance.sgml).

I think that part 1 should include a basic explanation of the GUC
parameter as well as how the number of workers is decided (which could
be similar to the description of the PARALLEL option of the VACUUM
command). Part 2 can explain the storage parameter as follows:

    Per-table value for the <xref linkend="guc-autovacuum-max-parallel-workers"/>
    parameter. If -1 is specified, the value of
    <varname>autovacuum_max_parallel_workers</varname>
    will be used. The default value is 0.

Part 3 can briefly mention that autovacuum can perform parallel vacuum
with the number of parallel workers capped by
autovacuum_max_parallel_workers, as follows:

    For tables with the <xref linkend="reloption-autovacuum-parallel-workers"/>
    storage parameter set, an autovacuum worker can perform index vacuuming and
    index cleanup with background workers. The number of workers launched by
    a single autovacuum worker is limited by
    <xref linkend="guc-autovacuum-max-parallel-workers"/>.

What do you think?

Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size |
|---|---|---|
| v30-0001-Add-parallel-vacuum-worker-usage-to-VACUUM-VERBO.patch | text/x-patch | 9.5 KB |