| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Daniil Davydov <3danissimo(at)gmail(dot)com> |
| Cc: | Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: POC: Parallel processing of indexes in autovacuum |
| Date: | 2026-01-21 22:28:24 |
| Message-ID: | CAD21AoD7_4gsQ2a82zO3SaRwjdw_3tyiYDHNFPUKQ5DAA5HOtA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Sat, Jan 17, 2026 at 6:52 AM Daniil Davydov <3danissimo(at)gmail(dot)com> wrote:
>
>
>
> > I've attached the patch proposing this change (please find
> > v19-0001_masahiko.patch).
>
> Thank you, I'll apply this patch. A few things in the patch that I changed:
> 1)
> + * The caller must call AutoVacuumReleaseParallelWorkers() to release the...
> I think that we also should mention AutoVacuumReleaseAllParallelWorkers.
> 2)
> + * Similar to AutoVacuumReleaseParallelWorkers(), but this function releases...
> If you don't mind, I'll leave the "same as above" formulation since this is
> typical for the postgres code.
>
Agreed.
>
> > BTW it seems to me that this GUC should be capped by
> > max_parallel_workers instead of max_worker_processes, no?
> >
>
> I explained my point about it here [1] and here [2]. What do you think?
I agree that autovacuum_max_parallel_workers should not be capped by
other GUC parameters when setting a value. However, these messages
don't seem to explain why this parameter is limited by
max_worker_processes instead of max_parallel_workers. You mentioned:
> I will keep the 'max_worker_processes' limit, so autovacuum will not
> waste time initializing a parallel context if there is no chance that
> the request will succeed.
> But it's worth remembering that actually the
> 'autovacuum_max_parallel_workers' parameter will always be implicitly
> capped by 'max_parallel_workers'.
It doesn't make sense to me that we limit
autovacuum_max_parallel_workers by max_worker_processes TBH. When
users want to have more parallel vacuum workers for autovacuum and the
VACUUM command, they would have to consider max_worker_processes,
max_parallel_workers, and max_parallel_maintenance_workers separately.
Given that max_parallel_workers is controlling the number of
max_worker_processes that can be used in parallel operations, I
believe that parallel vacuum workers for autovacuum should also be
taken from that pool.
>
> > ---
> > + * Note, that this function has no effect if we are non-autovacuum
> > + * parallel worker.
> > + */
> >
> > I don't think this kind of comment should be noted here since if we
> > change the parallel_vacuum_update_shared_delay_params() behavior in
> > the future, such comments would get easily out-of-sync.
> >
>
> If the behavior is changed, then all comments for this function will need to
> be changed anyway. Don't get me wrong - I just think that this Note is
> important for the readers. But if you doubt its usefulness, I don't
> mind deleting it.
I still could not figure out why it should be mentioned here instead
of at the comment of parallel_vacuum_update_shared_delay_params().
Readers can see, when reading the function itself, that calling
parallel_vacuum_update_shared_delay_params() from a parallel vacuum
worker for the VACUUM command has no effect.
In my opinion, we should mention here why we call
parallel_vacuum_update_shared_delay_params() but should not mention
what the called function does because it should have been described in
that function.
BTW can we expose pv_shared_cost_params so that we can check it in
vacuum_delay_point() before trying to call
parallel_vacuum_update_shared_delay_params()?
>
> > > > IIUC autovacuum parallel workers seems to update their
> > > > vacuum_cost_{delay|limit} every vacuum_delay_point(), which seems not
> > > > good. Can we somehow avoid unnecessary updates?
> > >
> > > More precisely, parallel worker *reads* leader's parameters every delay_point.
> > > Obviously, this does not mean that the parameters will necessarily be updated.
> > >
> > > But I don't see anything wrong with this logic. We just every time get the most
> > > relevant parameters from the leader. Of course we can introduce some
> > > signaling mechanism, but it will have the same effect as in the current code.
> >
> > Although the parameter propagation itself is working correctly, the
> > current implementation seems suboptimal performance-wise. Acquiring an
> > additional spinlock and updating the local variables for every block
> > seems too costly to me. IIUC we would end up incurring these costs
> > even when vacuum delays are disabled. I think we need to find a better
> > way.
> >
> > For example, we can have a generation of these parameters. That is,
> > the leader increments the generation (stored in PVSharedCostParams)
> > whenever updating them after reloading the configuration file, and
> > workers maintain its generation of the parameters currently used. If
> > the worker's generation < the global generation, it updates its
> > parameters along with its generation. I think we can implement the
> > generation using pg_atomic_u32, making the check for parameter updates
> > lock-free. There might be better ideas, though.
> >
>
> OK, I see your point. Considering that we need to check some shared state (in
> order to understand whether we should update our params), an atomic variable
> seems to be the best solution.
>
>
> Thank you for the review! Please see the v20 patches. Main changes:
> 1) Add new formula for freeParallelWorkers computation
> 2) Add 'nreserved' logging for parallel autovacuum
> 3) Add atomic variable to speed up checking shared params state change
> 4) New test for autovacuum_max_parallel_workers parameter change
> 5) Fully get rid of "custom" injection points in tests
>
The 0001 patch looks mostly good to me except for the above comment
(max_worker_processes vs. max_parallel_workers) and the following
point:
+ nfree_workers =
+ autovacuum_max_parallel_workers - prev_max_parallel_workers +
+ AutoVacuumShmem->av_freeParallelWorkers;
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+
+ /*
+ * Don't allow number of free workers to become less than zero if the
+ * parameter was decreased.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
Why does it do Max(x, 0) twice?
* 0002 patch:
+ if (vacrel->workers_usage.nplanned > 0 &&
+ AmAutoVacuumWorkerProcess())
+ {
+ /* Worker usage stats for parallel autovacuum */
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup: %d workers were planned, %d workers were reserved and %d workers were launched in total\n"),
+ vacrel->workers_usage.nplanned,
+ vacrel->workers_usage.nreserved,
+ vacrel->workers_usage.nlaunched);
+ }
+ else if (vacrel->workers_usage.nplanned > 0)
+ {
+ /* Worker usage stats for manual VACUUM (PARALLEL) */
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup: %d workers were planned and %d workers were launched in total\n"),
+ vacrel->workers_usage.nplanned,
+ vacrel->workers_usage.nlaunched);
+ }
Can we refactor this code to:
if (vacrel->workers_usage.nplanned > 0)
{
    if (AmAutoVacuumWorkerProcess())
        appendStringInfo(...);
    else
        appendStringInfo(...);
}
* 0003 patch:
+ if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
+ {
We can just check IsParallelWorker() here.
---
+extern void parallel_vacuum_update_shared_delay_params(void);
+extern void parallel_vacuum_propagate_cost_based_params(void);
I think it's better to have similar names to these functions for
consistency and readability. How about the following?
parallel_vacuum_update_delay_params();
parallel_vacuum_propagate_delay_params();
---
+
+ params_generation = pg_atomic_read_u32(&pv_shared_cost_params->generation);
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ pv_shared_cost_params->cost_delay = vacuum_cost_delay;
+ pv_shared_cost_params->cost_limit = vacuum_cost_limit;
+ pv_shared_cost_params->cost_page_dirty = VacuumCostPageDirty;
+ pv_shared_cost_params->cost_page_hit = VacuumCostPageHit;
+ pv_shared_cost_params->cost_page_miss = VacuumCostPageMiss;
I think we can check if the new cost-based delay parameters are really
changed before changing the shared values. If users don't change
cost-based delay parameters, we don't need to increment the generation
at all.
---
+ pg_atomic_write_u32(&pv_shared_cost_params->generation,
+ params_generation + 1);
We can use pg_atomic_add_fetch_u32() instead.
---
+/*
+ * Only autovacuum leader can reload config file. We use this structure in
+ * parallel autovacuum for keeping worker's parameters in sync with leader's
+ * parameters.
+ */
+typedef struct PVSharedCostParams
I'd suggest writing the overall description first (e.g., what the
struct holds and what it is used for), and then describing the
details and notes. For instance, readers might be confused when
reading the first sentence "Only autovacuum leader can reload config
file", as the struct definition itself is not tied to that autovacuum
implementation detail. We would need to mention it somewhere in the
comments, but I think it should not be the first sentence. How about
rewriting it to something like:
+/*
+ * Struct for cost-based vacuum delay related parameters to share among an
+ * autovacuum worker and its parallel vacuum workers.
+ */
---
+ slock_t spinlock; /* protects all fields below */
It's convention to name 'mutex' as a field name.
---
+static PVSharedCostParams * pv_shared_cost_params = NULL;
+
+/* See comments for structure above for the explanation. */
+static uint32 shared_params_generation_local = 0;
I think it's preferable to move these definitions of static variables
right before the function prototypes.
---
+ /*
+ * If 'true' then we are running parallel autovacuum. Otherwise, we are
+ * running parallel maintenance VACUUM.
+ */
+ bool am_parallel_autovacuum;
How about renaming it to use_shared_delay_params? I think it conveys
better what the field is used for.
* 0004 patch:
The patch introduces 5 injection points, which seems overkill to me
for implementing the tests. IIUC we can implement test2 with two
injection points: 'autovacuum-start-parallel-vacuum' (set right before
lazy_scan_heap() call) and
'autovacuum-leader-before-indexes-processing'.
1. stop the autovacuum worker at 'autovacuum-start-parallel-vacuum'.
2. change delay params and reload the conf.
3. let the autovacuum worker process tables (vacuum_delay_point() is
called during the heap scan).
4. stop the autovacuum worker at 'autovacuum-leader-before-indexes-processing'.
5. let parallel workers process indexes (vacuum_delay_point() is
called during index vacuuming).
For test3, I think we can write a DEBUG2 log in
adjust_free_parallel_workers() and check it during the test instead of
introducing the test-only function.
For test4 and test5, we check the number of free workers using
get_parallel_autovacuum_free_workers(). However, since autovacuum
could retry vacuuming the table, the test could fail.
And here are some general comments and suggestions:
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
We need comments to explain what we test with this test file.
---
+ $node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = $test_number;
+ ANALYZE test_autovac;
+ });
Why do we need to execute ANALYZE as well?
---
+ $node->wait_for_log($expected_log);
+ truncate $node->logfile, 0 or die "truncate failed: $!";
+}
Truncating the whole log after every test would greatly reduce
debuggability. We can instead pass the offset to wait_for_log() as the
start point to wait for the contents.
---
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
We can use generate_series() here. And it's faster to load the data
and then create indexes.
---
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
Please use safe_psql() instead.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com