Re: POC: Parallel processing of indexes in autovacuum

From: Daniil Davydov <3danissimo(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: 2025-06-18 08:03:10
Message-ID: CAJDiXghmsbTmnm--9B5bbuZXa1OL7SZ0HYppX3tx9XsdwfJBhA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, Jun 18, 2025 at 5:37 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sun, May 25, 2025 at 10:22 AM Daniil Davydov <3danissimo(at)gmail(dot)com> wrote:
> >
> > Thanks everybody for feedback! I attach a v4 patch to this letter.
> > Main features :
> > 1) 'parallel_autovacuum_workers' reloption - integer value, that sets
> > the maximum number of parallel a/v workers that can be taken from
> > bgworkers pool in order to process this table.
> > 2) 'max_parallel_autovacuum_workers' - GUC variable, that sets the
> > maximum total number of parallel a/v workers, that can be taken from
> > bgworkers pool.
> > 3) Parallel autovacuum does not try to use thresholds like
> > NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
> > 4) Parallel autovacuum now can report statistics like "planned vs. launched".
> > 5) For now I got rid of the 'reserving' idea, so now autovacuum
> > leaders are competing with everyone for parallel workers from the
> > bgworkers pool.
> >
> > What do you think about this implementation?
> >
>
> I think it basically makes sense to me. A few comments:
>
> ---
> The patch implements max_parallel_autovacuum_workers as a
> PGC_POSTMASTER parameter but can we make it PGC_SIGHUP? I think we
> don't necessarily need to make it a PGC_POSTMATER since it actually
> doesn't affect how much shared memory we need to allocate.
>

Yep, there's nothing stopping us from doing that. This is a usable
feature, I'll implement it in the v5 patch.

> ---
> I think it's better to have the prefix "autovacuum" for the new GUC
> parameter for better consistency with other autovacuum-related GUC
> parameters.
>
> ---
> #include "storage/spin.h"
> @@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
> {
> WaitForParallelWorkersToFinish(pcxt);
> WaitForParallelWorkersToExit(pcxt);
> +
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
> +
> pcxt->nworkers_launched = 0;
> if (pcxt->known_attached_workers)
> {
> @@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
> */
> HOLD_INTERRUPTS();
> WaitForParallelWorkersToExit(pcxt);
> +
> + /* Release all launched (i.e. reserved) parallel autovacuum workers. */
> + if (AmAutoVacuumWorkerProcess())
> + ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
> +
> RESUME_INTERRUPTS();
>
> I think that it's better to release workers in vacuumparallel.c rather
> than parallel.c.
>

Agree with both comments.

Thanks for the review! Please, see v5 patch :
1) GUC variable and field in autovacuum shmem are renamed
2) ParallelAutoVacuumReleaseWorkers call moved from parallel.c to
vacuumparallel.c
3) max_parallel_autovacuum_workers is now PGC_SIGHUP parameter
4) Fix little bug (ParallelAutoVacuumReleaseWorkers in autovacuum.c:735)

--
Best regards,
Daniil Davydov

Attachment Content-Type Size
v5-0002-Sandbox-for-parallel-index-autovacuum.patch text/x-patch 8.6 KB
v5-0001-Parallel-index-autovacuum-with-bgworkers.patch text/x-patch 23.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Frédéric Yhuel 2025-06-18 08:30:57 Re: [BUG] temporary file usage report with extended protocol and unnamed portals
Previous Message Jelte Fennema-Nio 2025-06-18 07:35:34 Re: minimum Meson version