Re: POC: Parallel processing of indexes in autovacuum

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Daniil Davydov <3danissimo(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: Parallel processing of indexes in autovacuum
Date: 2026-01-21 22:22:11
Message-ID: CAA5RZ0vbM_E+C0T475q2j5U1qnUgPBCPQ-3qWYdTxDCRnvE1VQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I took a look at v20-0001,0002 and 0003 and have some comments.

v20-0001:

1/

```
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+
+ /*
+ * Don't allow number of free workers to become less than zero if the
+ * patameter was decreased.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
```

This can probably be simplified to:

```
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
```

v20-0002:

1/

I don't think showing "reserved" in the logging is needed and could be
confusing.

```
parallel index vacuum/cleanup: 3 workers were planned, 3 workers were
reserved and 3 workers were launched in total
```

Also, even though the table has `autovacuum_parallel_workers = 2`, I
see 3 workers.
This is because one worker was for cleanup due to a gin index on the
table. I think it's better
to show separate lines for index vacuuming and index cleanup, just
like VACUUM VERBOSE.

```
INFO: launched 2 parallel vacuum workers for index vacuuming (planned: 2)
INFO: launched 1 parallel vacuum worker for index cleanup (planned: 1)
```

otherwise it will lead the user to think 3 workers were allocated for
either vacuuming or cleanup.

v20-0003:

1/

inside vacuum_delay_point, I would re-organize the checks to
first run the code block for the a/v worker:

```
if (ConfigReloadPending && AmAutoVacuumWorkerProcess())
```

then the a/v/ parallel worker:

```
if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
```

But I am also wondering if we should have a specific backend_type
for "autovacuum parallel worker" to differentiate that from the
existing "autovacuum worker".

and also we can have a helper macro like:
```
#define AmAutoVacuumParallelWorkerProcess() (MyBackendType ==
B_AUTOVAC_PARALLEL_WORKER)
```

What do you think?

2/

Add
```
+typedef struct PVSharedCostParams
````

to src/tools/pgindent/typedefs.list

3/

+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.spinlock);
+ pv_shared_cost_params = &(shared->cost_params);

NIT: move SpinLockInit last

4/

Instead of:

```
+ params_generation =
pg_atomic_read_u32(&pv_shared_cost_params->generation);
+
```
and then later on:
````
+ /*
+ * Increase generation of the parameters, i.e. let parallel workers know
+ * that they should re-read shared cost params.
+ */
+ pg_atomic_write_u32(&pv_shared_cost_params->generation,
+ params_generation + 1);
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
```

why can't we just do:

pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);

Also, do the pg_atomic_fetch_add_u32 outside of the spinlock. right?

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2026-01-21 22:22:19 Re: pgsql: tests: Add a test C++ extension module
Previous Message Tom Lane 2026-01-21 22:07:04 Re: Likely undefined behavior with some flexible arrays