Quick Links

Re: POC: Parallel processing of indexes in autovacuum

From:	Daniil Davydov <3danissimo(at)gmail(dot)com>
To:	Sami Imseih <samimseih(at)gmail(dot)com>
Cc:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: POC: Parallel processing of indexes in autovacuum
Date:	2025-11-22 20:13:03
Message-ID:	CAJDiXggrBsbzOisf+Nu8pZkYGrpUZaFbosL1Wbm3kKxzTm4xgw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On Sat, Nov 1, 2025 at 3:03 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Oct 28, 2025 at 6:10 AM Daniil Davydov <3danissimo(at)gmail(dot)com> wrote:
> >
> > I'll allow it for a/v leader. I've also thought about "compute_parallel_delay".
> > The simplest solution that I see is to move cost-based delay parameters to
> > shared state (PVShared) and create some variables such a
> > VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
> > What do you think about this idea?
>
> I think that we need to somehow have parallel workers use the new
> vacuum delay parameters (e.g., VacuumCostPageHit and
> VacuumCostPageMiss) after the leader reloads the configuration file.
> The leader shares the initial parameters with the parallel workers
> (via DSM) before starting the workers but doesn't propagate the
> updates during the parallel operations. And the worker doesn't reload
> the configuration file.

I'm still working on it.

> Here are some review comments for 0001 patch:
>
> +static void
> +autovacuum_worker_before_shmem_exit(int code, Datum arg)
> +{
> + if (code != 0)
> + AutoVacuumReleaseAllParallelWorkers();
> +}
> +
>
> AutoVacuumReleaseAllParallelWorkers() calls
> AutoVacuumReleaseParallelWorkers() only when av_nworkers_reserved > 0,
> so I think we don't need the condition 'if (code != 0)' here.

Yeah, I wrote it more like a hint for the reader - "we should call
this function only
if the process is exiting due to an error". But actually it is not
necessary condition.

>
> ---
> +extern void AutoVacuumReleaseAllParallelWorkers(void);
>
> There is no caller of this function outside of autovacuum.h.
>

I will fix it.

> ---
> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
> short_desc => 'Maximum number of parallel autovacuum workers, that
> can be taken from bgworkers pool.',
> long_desc => 'This parameter is capped by "max_worker_processes"
> (not by "autovacuum_max_workers"!).',
> variable => 'autovacuum_max_parallel_workers',
> boot_val => '0',
> min => '0',
> max => 'MAX_BACKENDS',
> },
>
> Parallel vacuum in autovacuum can be used only when users set the
> autovacuum_parallel_workers storage parameter. How about using the
> default value 2 for autovacuum_max_parallel_workers GUC parameter?
>

Sounds reasonable, +1 for it.

On Fri, Nov 21, 2025 at 2:31 AM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
>
> Hi,
>
> I started to review this patch set again, and it needed rebasing, so I
> went ahead and did that.

Thanks for the review and rebasing the patch!

>
> I also have some comments:
>
> #1
> In AutoVacuumReserveParallelWorkers()
> I think here we should assert:
>
> ```
> Assert(nworkers <= AutoVacuumShmem->av_freeParallelWorkers);
> ```
> prior to:
> ```
> + AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
> ```
>
> We are capping nworkers earlier in parallel_vacuum_compute_workers()
>
> ```
> /* Cap by GUC variable */
> parallel_workers = Min(parallel_workers, max_workers);
> ```
>
> so the assert will safe-guard against someone making a faulty change
> in parallel_vacuum_compute_workers()
>

Hm, I guess it is just a bug. We should reduce av_freeParallelWorkers by the
computed 'nreserved/ value (thus, we don't need any assertion). I'll fix it.

>
> #2
> In
> parallel_vacuum_process_all_indexes()
>
> ```
> + /*
> + * Reserve workers in autovacuum global state. Note, that we
> may be given
> + * fewer workers than we requested.
> + */
> + if (AmAutoVacuumWorkerProcess() && nworkers > 0)
> + nworkers = AutoVacuumReserveParallelWorkers(nworkers);
> ```
>
> nworkers has a double meaning. The return value of
> AutoVacuumReserveParallelWorkers
> is nreserved. I think this should be
>
> ```
> nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> ```
>
> and nreserved becomes the authoritative value for the number of parallel
> workers after that point.

Reserving parallel workers is specific for an autovacuum. If we add
'nreserved' variable, we would have to change all conditions below in
order not to break maintenance parallel vacuum. I think it will be confusing :
***
if (nworkers > 0 || (AmAutoVacuumWorkerProcess() && nreserved > 0))
***

Moreover, 'nworkers' reflects how many workers will be involved in vacuuming,
and I think that capping it by 'nreserved' is not breaking this semantic.

>
> #3
> I noticed in the logging:
>
> ```
> 2025-11-20 18:44:09.252 UTC [36787] LOG: automatic vacuum of table
> "test.public.t": index scans: 0
> workers usage statistics for all of index scans : launched in
> total = 3, planned in total = 3
> pages: 0 removed, 503306 remain, 14442 scanned (2.87% of
> total), 0 eagerly scanned
> tuples: 101622 removed, 7557074 remain, 0 are dead but not yet removable
> removable cutoff: 1711, which was 1 XIDs old when operation ended
> frozen: 4793 pages from table (0.95% of total) had 98303 tuples frozen
> visibility map: 4822 pages set all-visible, 4745 pages set
> all-frozen (0 were all-visible)
> index scan bypassed: 8884 pages from table (1.77% of total)
> have 195512 dead item identifiers
> ```
>
> that even though index scan was bypased, we still launched parallel
> workers. I didn't dig deep into this,
> but that looks wrong. what do you think?
>

We can do both index vacuuming and index cleanup in parallel. I guess that
in your situation the vacuum was bypassed, but cleanup was called.

> #4
> instead of:
>
> "workers usage statistics for all of index scans : launched in total =
> 0, planned in total = 0"
>
> how about:
>
> "parallel index scan : workers planned = 0, workers launched = 0"
>
> also log this after the "index scan needed:" line; so it looks like
> this. What do you think>
>
> ```
> index scan needed: 13211 pages from table (2.63% of total) had
> 289482 dead item identifiers removed
> parallel index scan : workers planned = 0, workers launched = 0
> index "t_pkey": pages: 25234 in total, 0 newly deleted, 0 currently
> deleted, 0 reusable
> index "t_c1_idx": pages: 10219 in total, 0 newly deleted, 0
> currently deleted, 0 reusable
> ```

Agree, it looks better.

Thanks everybody for the comments!
Please, see v15 patches.

--
Best regards,
Daniil Davydov

Attachment	Content-Type	Size
v15-0004-Documentation-for-parallel-autovacuum.patch	text/x-patch	4.4 KB
v15-0002-Logging-for-parallel-autovacuum.patch	text/x-patch	7.7 KB
v15-0003-Tests-for-parallel-autovacuum.patch	text/x-patch	19.2 KB
v15-0001-Parallel-autovacuum.patch	text/x-patch	19.7 KB

In response to

Re: POC: Parallel processing of indexes in autovacuum at 2025-11-20 19:31:45 from Sami Imseih

Responses

Re: POC: Parallel processing of indexes in autovacuum at 2025-11-22 22:51:42 from Sami Imseih

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Nico Williams	2025-11-22 21:29:02	Re: [oauth] SASL mechanisms
Previous Message	Nathan Bossart	2025-11-22 20:03:54	Re: another autovacuum scheduling thread