Re: POC: Parallel processing of indexes in autovacuum

From: Daniil Davydov <3danissimo(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: 2025-10-28 13:09:59
Message-ID: CAJDiXghP2kXnEz+cj3rAWNM3NdKSB_4WtnngFXpVz2omPhGr5A@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Tue, Sep 16, 2025 at 1:50 AM Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
>
> I've rebased this patchset to the current master.
> That required me to move the new GUC definition to guc_parameters.dat.
> Also, I adjusted typedefs.list and made pgindent.

Thank you for looking into it!

>
> + {
> + {
> + "autovacuum_parallel_workers",
> + "Maximum number of parallel autovacuum workers that can be used for processing this table.",
> + RELOPT_KIND_HEAP,
> + ShareUpdateExclusiveLock
> + },
> + -1, -1, 1024
> + },
>
> Should we use MAX_PARALLEL_WORKER_LIMIT instead of hard-coded 1024 here?

I'm afraid we would have to include an additional header file to do that,
and as far as I know we try to avoid adding such dependencies. For now, I
will leave it hard-coded.
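
For reference, this is roughly how the entry could look with the macro (a
sketch only; it assumes MAX_PARALLEL_WORKER_LIMIT would be reachable from
reloptions.c, e.g. via postmaster/bgworker_internals.h where it is defined):

```c
/* Sketch: same relopt_int entry as above, but using the existing limit
 * macro instead of the literal 1024. Requires pulling in the header
 * that defines MAX_PARALLEL_WORKER_LIMIT. */
{
	{
		"autovacuum_parallel_workers",
		"Maximum number of parallel autovacuum workers that can be used for processing this table.",
		RELOPT_KIND_HEAP,
		ShareUpdateExclusiveLock
	},
	-1, -1, MAX_PARALLEL_WORKER_LIMIT
},
```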

>
> - * Support routines for parallel vacuum execution.
> + * Support routines for parallel vacuum and autovacuum execution. In the
> + * future comments, the word "vacuum" will refer to both vacuum and
> + * autovacuum.
>
> Not sure about the usage of word "future" here.
> It doesn't look clear what it means.
> Could we use "below" or "within this file"?

Agree, fixed.

> I see parallel_vacuum_process_all_indexes() have a TRY/CATCH block.
> As I heard, the overhead of setting/doing jumps is platform-dependent, and
> not harmless on some platforms. Therefore, can we skip TRY/CATCH block
> for non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain(),
> that would save us from repeatedly setting a jump in autovacuum workers too.

Good idea. I found a TRY/CATCH block inside the do_autovacuum function,
which is obviously called only from autovacuum, and decided to move the
ReleaseAllWorkers call there.

>
> In general, I think this patchset badly lack of testing. I think it needs tap tests
> checking from the logs that autovacuum has been done in parallel. Also, it
> would be good to set up some injection points, and check that reserved
> autovacuum parallel workers are getting released correctly in the case of errors.

Some time ago I tried to write a test, but it looked very ugly. Your idea
with injection points helped me write much more reliable tests - see them
in the new (v12) pack of patches.

On Wed, Sep 17, 2025 at 1:31 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Sep 15, 2025 at 11:50 AM Alexander Korotkov
> <aekorotkov(at)gmail(dot)com> wrote:
> >
> > I see parallel_vacuum_process_all_indexes() have a TRY/CATCH block. As I heard,
> > the overhead of setting/doing jumps is platform-dependent, and not harmless on some
> > platforms. Therefore, can we skip TRY/CATCH block for non-autovacuum vacuum?
> > Possibly we could move it to AutoVacWorkerMain(), that would save us from repeatedly
> > setting a jump in autovacuum workers too.
>
> I wonder if using the TRY/CATCH block is not enough to ensure that
> autovacuum workers release the reserved parallel workers in FATAL
> cases.
>

That's true. I'll register a "before_shmem_exit" callback for autovacuum,
which will release any reserved workers if the a/v worker exits abnormally.

>
> IIUC the patch still has one problem in terms of reloading the
> configuration parameters during parallel mode as I mentioned
> before[1].
>

Yep. I was happy to see that you think that config file processing is OK for
autovacuum :)
I'll allow it for the a/v leader. I've also thought about
"compute_parallel_delay". The simplest solution that I see is to move the
cost-based delay parameters into shared state (PVShared) and create
variables such as VacuumSharedCostBalance, so we can use them inside
vacuum_delay_point.
What do you think about this idea?

Other approaches, such as telling parallel workers that they should reload
the config themselves, look a bit too invasive IMO.

Thanks everybody for the review! Please see the v12 patches:
1) Implement tests for parallel autovacuum.
2) Fix the error with unreleased workers - see the TRY/CATCH block in
do_autovacuum and the before_shmem_exit callback registration in
AutoVacWorkerMain.
3) Allow the a/v leader to process the config file (see guc.c).

--
Best regards,
Daniil Davydov

Attachment Content-Type Size
v12-0004-Documentation-for-parallel-autovacuum.patch text/x-patch 4.4 KB
v12-0003-Tests-for-parallel-autovacuum.patch text/x-patch 18.4 KB
v12-0002-Logging-for-parallel-autovacuum.patch text/x-patch 7.7 KB
v12-0001-Parallel-index-autovacuum.patch text/x-patch 19.6 KB
