| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Daniil Davydov <3danissimo(at)gmail(dot)com> |
| Cc: | Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: POC: Parallel processing of indexes in autovacuum |
| Date: | 2026-03-02 22:25:39 |
| Message-ID: | CAD21AoAXMjX03h5K84u0heBLU+fqGgWBGBDwnBDGSs=DhyF9pQ@mail.gmail.com |
| Lists: | pgsql-hackers |
On Sun, Mar 1, 2026 at 6:46 AM Daniil Davydov <3danissimo(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Sat, Feb 28, 2026 at 8:57 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > IIUC earlier patches defined autovacuum_max_parallel_workers with the
> > limit by max_worker_processes. Suppose we set:
> >
> > - max_worker_processes = 8
> > - autovacuum_max_parallel_workers = 4
> > - max_parallel_workers = 4
> >
> > If we want to disable all parallel operations, we would need to set
> > max_parallel_workers to 0 as well as either
> > autovacuum_max_parallel_workers to 0, no? This is because if we set
> > only max_parallel_workers to 0, autovacuum workers still can take
> > parallel vacuum workers from the max_worker_processes pool. I might be
> > missing something though.
> >
>
> Even if av_max_parallel_workers is limited by max_worker_processes,
> it is enough to set max_parallel_workers to 0 to disable parallel
> autovacuum.
>
> When the a/v leader wants to create supportive workers, it calls the
> "RegisterDynamicBackgroundWorker" function, which contains the following
> logic:
>
>     /*
>      * If this is a parallel worker, check whether there are already too many
>      * parallel workers; if so, don't register another one.
>      */
>     if (parallel && (BackgroundWorkerData->parallel_register_count -
>                      BackgroundWorkerData->parallel_terminate_count) >=
>         max_parallel_workers)
>     {
>         ....
>     }
>
> Thus, a/v leader cannot launch any workers if max_parallel_workers is set to 0.
Right. But this fact actually supports the idea that limiting
autovacuum_max_parallel_workers by max_parallel_workers is more
appropriate, no?
>
> > > > If we write the log "%d parallel autovacuum workers have been
> > > > released" in AutoVacuumReleaseParallelWorkers(), can we simplify both
> > > > tests (4 and 5) further?
> > > >
> > >
> > > It won't help the 4th test, because ReleaseParallelWorkers is called
> > > on both ERROR and shmem_exit, but we want to be sure that the
> > > workers are released in the try/catch block (i.e. before the shmem_exit).
> >
> > We already call AutoVacuumReleaseAllParallelWorkers() in the PG_CATCH()
> > block in do_autovacuum(). If we write the log in
> > AutoVacuumReleaseParallelWorkers(), the tap test would be able to check
> > the log, no?
> >
>
> Not quite. Assume that we add the "%d workers have been released" log to
> ReleaseAllParallelWorkers. Then we trigger an error for the a/v leader and
> wait for this log (expecting that the workers will be released inside the
> try/catch block).
>
> Even if there is a bug in the code and the a/v leader cannot release the
> parallel workers due to the error, one day it will finish vacuuming and call
> "proc_exit". During "proc_exit", the "before_shmem_exit_hook" along with
> "ReleaseAllParallelWorkers" will be called.
What bugs are you concerned about in this case? I'm not sure what you
meant by "a/v leader cannot release parallel workers due to the error".
It sounds like you're describing a case where there is a bug in
AutoVacuumReleaseParallelWorkers(), but if there is such a bug and the
leader fails to release parallel workers, we would end up not writing
these elogs in either case.
>
> > > Also, I don't know whether the 5th test needs this log at all, because in
> > > the end we are checking the number of free parallel workers. If a killed
> > > a/v leader doesn't release parallel workers, we'll notice it.
> >
> > If we can check the log written at process shutdown time, I think we
> > can somewhat simplify the test 5 logic by not attaching
> > 'autovacuum-start-parallel-vacuum' injection point.
> >
> > 1. attach 'autovacuum-leader-before-indexes-processing' injection point.
> > 2. wait for an av worker to stop at the injection point.
> > 3. terminate the av worker.
> > 4. verify from the log if the workers have been released.
> > 5. disable parallel autovacuum.
> > 6. check the free workers (should be 10).
> >
> > Step 5 and 6 seems to be optional though.
>
> OK, I see your point. But I'm afraid that the "%d released" log can't help
> us here, for the reason I described above:
> the "%d released" log can be emitted from several places, and we cannot be
> sure which one has produced it.
>
> I propose doing the same as we did for the try/catch block: add logging
> inside "autovacuum_worker_before_shmem_exit" with some unique message.
> That way, we can be sure that the workers are released precisely in the
> "before_shmem_exit_hook".
>
> The alternative is to pass some additional information to the
> "ReleaseAllParallelWorkers" function (to supplement the log it emits), but
> that doesn't seem like a good solution to me.
I'm not sure it's important to check how
AutoVacuumReleaseAllParallelWorkers() has been called (either in the
PG_CATCH() block or by autovacuum_worker_before_shmem_exit()). We would
end up having to add a unique message to each future caller of
AutoVacuumReleaseAllParallelWorkers(). I think it's more important to
make sure that all workers have been released in the end. In that sense,
it would make more sense to verify that all workers have actually been
released (i.e., by checking get_parallel_autovacuum_free_workers())
after a parallel vacuum, instead of checking for workers being released
via debug logs. That is, at the end of each test we can check that
get_parallel_autovacuum_free_workers() returns the expected number
after disabling parallel autovacuum.
>
> > +    if (AmAutoVacuumWorkerProcess())
> > +    {
> > +        /* Worker usage stats for parallel autovacuum. */
> > +        appendStringInfo(&buf,
> > +                         _("parallel workers: index vacuum: %d planned, %d reserved, %d launched in total\n"),
> > +                         vacrel->workers_usage.vacuum.nplanned,
> > +                         vacrel->workers_usage.vacuum.nreserved,
> > +                         vacrel->workers_usage.vacuum.nlaunched);
> > +    }
> > +    else
> > +    {
> > +        /* Worker usage stats for manual VACUUM (PARALLEL). */
> > +        appendStringInfo(&buf,
> > +                         _("parallel workers: index vacuum: %d planned, %d launched in total\n"),
> > +                         vacrel->workers_usage.vacuum.nplanned,
> > +                         vacrel->workers_usage.vacuum.nlaunched);
> > +    }
> > +}
> >
> > These comments are very obvious so I don't think we need them.
>
> I agree.
>
> > Instead, I think it would be good to explain why we don't need to
> > report "reserved" numbers in the manual vacuum cases.
> >
>
> I think we can clarify somewhere why the "reserved" statistic is
> collected only for autovacuum. PVWorkersStats is an appropriate place
> for it. That way, there will be no need to explain it while
> constructing the log.
On second thought about "planned" and "reserved": can we consider what
the patch implements as "reserved" to be the "planned" number in
autovacuum cases? That is, in autovacuum cases, the "planned" number
would account for the parallel degree based on the number of indexes
(or the autovacuum_parallel_workers value) as well as the number of
workers that have actually been reserved. In cases of an
autovacuum_max_parallel_workers shortage, users would notice from the
logs that not enough workers were planned in the first place relative
to the number of indexes on the table. That might be less confusing for
users than introducing a new "reserved" concept in the vacuum logs.
Also, it slightly helps simplify the code.
>
> **Comments on the 0003 patch**
>
> >
> > +#define VacCostParamsEquals(params) \
> > + (vacuum_cost_delay == (params).cost_delay && \
> > + vacuum_cost_limit == (params).cost_limit && \
> > + VacuumCostPageDirty == (params).cost_page_dirty && \
> > + VacuumCostPageHit == (params).cost_page_hit && \
> > + VacuumCostPageMiss == (params).cost_page_miss)
> >
> > I'm not sure this macro helps reduce lines of code or improve
> > readability, as it's used only once, and it's slightly unnatural to me
> > that an *Equals macro takes only one argument.
> >
>
> I agree, it looks a bit odd. I'll remove it.
> Moreover, this shmem state can be updated only by the a/v leader worker,
> so I'll allow it to read the shared variables without holding a spinlock.
> That seems reliable enough; what do you think?
Right. It's safe for the leader to read these fields without locks.
> > ---
> > +        ereport(DEBUG2,
> > +                (errmsg("number of free parallel autovacuum workers is set to %u due to config reload",
> > +                        AutoVacuumShmem->av_freeParallelWorkers),
> > +                 errhidecontext(true)));
> >
> > Why do we need to add errhidecontext(true) here?
> >
>
> I thought we didn't need to write redundant info to the logfile. But I
> don't see other DEBUG2 messages hiding their context, so I'll remove it.
>
> BTW, do we want to use "elog" here too?
+1
Here are some comments:

* 0001 patch:

      * of the worker list (see above).
     @@ -299,6 +308,8 @@ typedef struct
         WorkerInfo    av_startingWorker;
         AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
         pg_atomic_uint32 av_nworkersForBalance;
    +    uint32        av_freeParallelWorkers;
    +    uint32        av_maxParallelWorkers;
     } AutoVacuumShmemStruct;

We should use int32 instead of uint32.
* 0003 patch:
I've attached the proposed changes to the 0003 patch, which includes:
- removal of VacuumCostParams as it's not necessary.
- comment updates.
- other cosmetic updates.
* 0004 patch:
+#ifdef USE_INJECTION_POINTS
+    /*
+     * If we are parallel autovacuum worker, we can consume delay parameters
+     * during index processing (via vacuum_delay_point call). This logging
+     * allows tests to ensure this.
+     */
+    if (shared->is_autovacuum)
+        elog(DEBUG2,
+             "parallel autovacuum worker cost params: cost_limit=%d, cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d, cost_page_hit=%d",
+             vacuum_cost_limit,
+             vacuum_cost_delay,
+             VacuumCostPageMiss,
+             VacuumCostPageDirty,
+             VacuumCostPageHit);
+#endif
While it's true that we use these logs only in the regression tests,
which are enabled only when injection points are enabled, the logs
themselves are not related to injection points. I'd recommend writing
these logs when the worker refreshes its local delay parameters (i.e.,
in parallel_vacuum_update_shared_delay_params()).
---
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 20
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
It would be better to set log_autovacuum_min_duration = 0 for the
specific table instead of setting it globally.
---
+    uint32        nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+    ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+    nfree_workers = AutoVacuumGetFreeParallelWorkers();
+
+    PG_RETURN_UINT32(nfree_workers);
+}
As I commented above, I think we should use int32 for the number of
parallel free workers. So let's change it here too.
---
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+    uint32        nfree_workers;
+
+#ifndef USE_INJECTION_POINTS
+    ereport(ERROR, errmsg("injection points not supported"));
+#endif
+
+
I don't think we necessarily need to check USE_INJECTION_POINTS in this
function, as we already have the check in the tap tests. The function
itself works even without injection points.
---
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
Please update the copyright year here too.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size |
|---|---|---|
| change_0003_masahiko.patch | text/x-patch | 10.1 KB |