Re: POC: Parallel processing of indexes in autovacuum

From: Daniil Davydov <3danissimo(at)gmail(dot)com>
To: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: 2026-03-30 08:44:22
Message-ID: CAJDiXgi73x7h0=UoXriFjskRB6htZ-uqXKqvWN3RefuxbP93gA@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram(at)gmail(dot)com> wrote:
>
> Thank you for working on this, very useful feature. Sharing a few thoughts:
>
> 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM resources in parallel_vacuum_compute_workers?

Actually, autovacuum_max_parallel_workers is already limited by
max_parallel_workers. It is not clear to me why we allow setting this GUC
higher than max_parallel_workers, but if that happens, I think it is a user
misconfiguration.
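
As a rough illustration of that relationship (a simplified model, not the
patch's actual code; the function name is hypothetical), the number of
workers autovacuum can really obtain is bounded by the smaller of the two GUCs:

```python
def effective_parallel_cap(autovacuum_max_parallel_workers: int,
                           max_parallel_workers: int) -> int:
    """Upper bound on parallel workers autovacuum could actually obtain.

    Hypothetical helper for illustration only; it is not part of PostgreSQL.
    """
    # The cluster-wide pool (max_parallel_workers) bounds what can be launched,
    # so a larger autovacuum_max_parallel_workers has no additional effect.
    return max(0, min(autovacuum_max_parallel_workers, max_parallel_workers))


# Misconfigured case: the autovacuum GUC is set above the global pool.
print(effective_parallel_cap(16, 8))
```

Under this model, raising autovacuum_max_parallel_workers past
max_parallel_workers changes nothing in practice.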

> 2. Is it intentional that other autovacuum workers not yield cost limits to the parallel auto vacuum workers? Cost limits are distributed first equally to the autovacuum workers,
> and then they share that. Therefore, parallel workers will be heavily throttled. IIUC, this problem doesn't exist with manual vacuum.
> If we don't fix this, at least we should document this.

Parallel a/v workers inherit the cost-based parameters (including
vacuum_cost_limit) from the leader worker. Do you mean that this value can be
too low for parallel operation? If so, the user can manually increase the
vacuum_cost_limit reloption for those tables where parallel a/v sleeps too
much (due to cost delay).
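
To make the concern concrete, here is a minimal sketch of the arithmetic. It
assumes the base limit is split equally among the autovacuum workers that
take part in cost balancing, and that parallel workers simply inherit the
leader's share. The function name is hypothetical and the model ignores
per-table overrides:

```python
def leader_cost_limit(base_cost_limit: int, balanced_workers: int) -> int:
    """Cost limit one autovacuum leader gets after equal balancing.

    Simplified model for illustration; not PostgreSQL source code.
    """
    if balanced_workers <= 0:
        # No balancing in effect: the leader keeps the full base limit.
        return base_cost_limit
    # Equal split among the balanced workers, never dropping below 1.
    return max(base_cost_limit // balanced_workers, 1)


# With a base limit of 200 and 3 active autovacuum workers, the leader
# (and, under the assumption above, its parallel workers) would run with
# this limit:
print(leader_cost_limit(200, 3))
```

The more autovacuum workers are active, the smaller the limit the parallel
workers inherit, which is why a per-table override can help.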

BTW, the cost limit propagation to the parallel a/v workers is worth
describing in the documentation. I'll add it in the next patch version.

> 3. Additionally, is there a point where, based on the cost limits, launching additional workers becomes counterproductive compared to running fewer workers and preventing it?

I don't think that we can find a universal limit that will be appropriate
for all possible configurations. For now we are using a pretty simple formula
to calculate the parallel degree. Since users have several ways to affect
this formula, I guess that there will be no problems with it (except for my
concerns about the opt-out style).

> 4. Would it make sense to add a table level override to disable parallelism or set parallel worker count?

We already have the "autovacuum_parallel_workers" reloption, which is used as
an additional limit on the number of parallel workers. In particular, this
reloption can be used to disable parallelism entirely.
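
A minimal sketch of how such a per-table limit behaves. It assumes that a
value of 0 disables parallelism and that an unset reloption imposes no extra
limit. The helper name and the use of None for "unset" are illustrative
assumptions, not the patch's actual representation:

```python
from typing import Optional


def table_parallel_workers(requested: int, reloption: Optional[int]) -> int:
    """Apply a per-table limit on top of the computed parallel degree.

    Hypothetical model: None means the reloption is not set for the table.
    """
    if reloption is None:
        # No table-level limit: keep the degree computed elsewhere.
        return max(0, requested)
    # The reloption only lowers the degree; 0 turns parallelism off.
    return max(0, min(requested, reloption))


print(table_parallel_workers(7, None))  # no table-level limit
print(table_parallel_workers(7, 2))     # limited by the reloption
print(table_parallel_workers(7, 0))     # parallelism disabled for the table
```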

>
> I ran some perf tests to show the improvements with parallel vacuum and shared below.

Thank you very much!

> Observations:
>
> 1. Parallel autovacuum provides consistent speedup. With cost_limit=200 and
> 7 workers, vacuum completes 1.41x faster (71s -> 50s). With cost_limit=60,
> the speedup is 1.25x (194s -> 154s).
> 2. I see the benefit comes from parallelizing index vacuum. With 8 indexes totaling
> ~530 MB, parallel workers scan indexes concurrently instead of the leader
> scanning them one by one. The leader's CPU user time drops from ~3s to
> ~0.8s as index work is offloaded
>

A 1.41x speedup with 7 parallel workers may not seem like a great win, but it
is measured over the whole autovacuum operation (not only index
bulkdel/cleanup), and with pretty small indexes.

May I ask you to run the same test with a larger table (several dozen
gigabytes)? I think the results will be more "expressive".

--
Best regards,
Daniil Davydov
