| From: | Sami Imseih <samimseih(at)gmail(dot)com> |
|---|---|
| To: | Daniil Davydov <3danissimo(at)gmail(dot)com> |
| Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: POC: Parallel processing of indexes in autovacuum |
| Date: | 2026-01-21 22:22:11 |
| Message-ID: | CAA5RZ0vbM_E+C0T475q2j5U1qnUgPBCPQ-3qWYdTxDCRnvE1VQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
I took a look at v20-0001,0002 and 0003 and have some comments.
v20-0001:
1/
```
+
+ /*
+ * Cap or increase number of free parallel workers according to the
+ * parameter change.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
+
+ /*
+ * Don't allow number of free workers to become less than zero if the
+ * patameter was decreased.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Max(AutoVacuumShmem->av_freeParallelWorkers, 0);
```
This can probably be simplified to:
```
AutoVacuumShmem->av_freeParallelWorkers = Max(nfree_workers, 0);
```
v20-0002:
1/
I don't think showing "reserved" in the logging is needed and could be
confusing.
```
parallel index vacuum/cleanup: 3 workers were planned, 3 workers were
reserved and 3 workers were launched in total
```
Also, even though the table has `autovacuum_parallel_workers = 2`, I
see 3 workers.
This is because one worker was for cleanup due to a gin index on the
table. I think it's better
to show separate lines for index vacuuming and index cleanup, just
like VACUUM VERBOSE.
```
INFO: launched 2 parallel vacuum workers for index vacuuming (planned: 2)
INFO: launched 1 parallel vacuum worker for index cleanup (planned: 1)
```
otherwise it will lead the user to think 3 workers were allocated for
either vacuuming or cleanup.
v20-0003:
1/
inside vacuum_delay_point, I would re-organize the checks to
first run the code block for the a/v worker:
```
if (ConfigReloadPending && AmAutoVacuumWorkerProcess())
```
then the a/v/ parallel worker:
```
if (!AmAutoVacuumWorkerProcess() && IsParallelWorker())
```
But I am also wondering if we should have a specific backend_type
for "autovacuum parallel worker" to differentiate that from the
existing "autovacuum worker".
and also we can have a helper macro like:
```
#define AmAutoVacuumParallelWorkerProcess() (MyBackendType ==
B_AUTOVAC_PARALLEL_WORKER)
```
What do you think?
2/
Add
```
+typedef struct PVSharedCostParams
````
to src/tools/pgindent/typedefs.list
3/
+ pg_atomic_init_u32(&shared->cost_params.generation, 0);
+ SpinLockInit(&shared->cost_params.spinlock);
+ pv_shared_cost_params = &(shared->cost_params);
NIT: move SpinLockInit last
4/
Instead of:
```
+ params_generation =
pg_atomic_read_u32(&pv_shared_cost_params->generation);
+
```
and then later on:
````
+ /*
+ * Increase generation of the parameters, i.e. let parallel workers know
+ * that they should re-read shared cost params.
+ */
+ pg_atomic_write_u32(&pv_shared_cost_params->generation,
+ params_generation + 1);
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
```
why can't we just do:
pg_atomic_fetch_add_u32(&pv_shared_cost_params->generation, 1);
Also, do the pg_atomic_fetch_add_u32 outside of the spinlock. right?
--
Sami Imseih
Amazon Web Services (AWS)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Smith | 2026-01-21 22:22:19 | Re: pgsql: tests: Add a test C++ extension module |
| Previous Message | Tom Lane | 2026-01-21 22:07:04 | Re: Likely undefined behavior with some flexible arrays |