| From: | Sami Imseih <samimseih(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | Daniil Davydov <3danissimo(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: POC: Parallel processing of indexes in autovacuum |
| Date: | 2025-11-20 19:31:45 |
| Message-ID: | CAA5RZ0sSXDza7_nUUbhHL_Sws+M+HR1daKJPXHpdLuNCkwUgUg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
I started to review this patch set again, and it needed rebasing, so I
went ahead and did that.
I also have some comments:
#1
In AutoVacuumReserveParallelWorkers()
I think here we should assert:
```
Assert(nworkers <= AutoVacuumShmem->av_freeParallelWorkers);
```
prior to:
```
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
```
We are capping nworkers earlier in parallel_vacuum_compute_workers()
```
/* Cap by GUC variable */
parallel_workers = Min(parallel_workers, max_workers);
```
so the assert will safe-guard against someone making a faulty change
in parallel_vacuum_compute_workers()
#2
In
parallel_vacuum_process_all_indexes()
```
+ /*
+ * Reserve workers in autovacuum global state. Note, that we
may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
```
nworkers has a double meaning. The return value of
AutoVacuumReserveParallelWorkers
is nreserved. I think this should be
```
nreserved = AutoVacuumReserveParallelWorkers(nworkers);
```
and nreserved becomes the authoritative value for the number of parallel
workers after that point.
#3
I noticed in the logging:
```
2025-11-20 18:44:09.252 UTC [36787] LOG: automatic vacuum of table
"test.public.t": index scans: 0
workers usage statistics for all of index scans : launched in
total = 3, planned in total = 3
pages: 0 removed, 503306 remain, 14442 scanned (2.87% of
total), 0 eagerly scanned
tuples: 101622 removed, 7557074 remain, 0 are dead but not yet removable
removable cutoff: 1711, which was 1 XIDs old when operation ended
frozen: 4793 pages from table (0.95% of total) had 98303 tuples frozen
visibility map: 4822 pages set all-visible, 4745 pages set
all-frozen (0 were all-visible)
index scan bypassed: 8884 pages from table (1.77% of total)
have 195512 dead item identifiers
```
that even though index scan was bypased, we still launched parallel
workers. I didn't dig deep into this,
but that looks wrong. what do you think?
#4
instead of:
"workers usage statistics for all of index scans : launched in total =
0, planned in total = 0"
how about:
"parallel index scan : workers planned = 0, workers launched = 0"
also log this after the "index scan needed:" line; so it looks like
this. What do you think>
```
index scan needed: 13211 pages from table (2.63% of total) had
289482 dead item identifiers removed
parallel index scan : workers planned = 0, workers launched = 0
index "t_pkey": pages: 25234 in total, 0 newly deleted, 0 currently
deleted, 0 reusable
index "t_c1_idx": pages: 10219 in total, 0 newly deleted, 0
currently deleted, 0 reusable
```
--
Sami Imseih
Amazon Web Services (AWS)
| Attachment | Content-Type | Size |
|---|---|---|
| v14-0004-Documentation-for-parallel-autovacuum.patch | application/octet-stream | 4.4 KB |
| v14-0003-Tests-for-parallel-autovacuum.patch | application/octet-stream | 19.2 KB |
| v14-0002-Logging-for-parallel-autovacuum.patch | application/octet-stream | 7.7 KB |
| v14-0001-Parallel-autovacuum.patch | application/octet-stream | 19.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Masahiko Sawada | 2025-11-20 20:18:08 | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Previous Message | Greg Burd | 2025-11-20 19:08:57 | Re: Buffer locking is special (hints, checksums, AIO writes) |