| From: | Daniil Davydov <3danissimo(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: POC: Parallel processing of indexes in autovacuum |
| Date: | 2026-03-01 14:46:42 |
| Message-ID: | CAJDiXgjTkuqSPerC_nasxDz6d2Komf1ipYKV6SupDRnc9yhO9w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On Sat, Feb 28, 2026 at 8:57 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> IIUC earlier patches defined autovacuum_max_parallel_workers with the
> limit by max_worker_processes. Suppose we set:
>
> - max_worker_processes = 8
> - autovacuum_max_parallel_workers = 4
> - max_parallel_workers = 4
>
> If we want to disable all parallel operations, we would need to set
> max_parallel_workers to 0 as well as either
> autovacuum_max_parallel_workers to 0, no? This is because if we set
> only max_parallel_workers to 0, autovacuum workers still can take
> parallel vacuum workers from the max_worker_processes pool. I might be
> missing something though.
>
Even if av_max_parallel_workers is limited by max_worker_processes,
it is enough to set max_parallel_workers to 0 to disable parallel
autovacuum.
When a/v leader wants to create supportive workers, it calls
"RegisterDynamicBackgroundWorker" function, which contain following
logic :
/*
* If this is a parallel worker, check whether there are already too many
* parallel workers; if so, don't register another one.
*/
if (parallel && (BackgroundWorkerData->parallel_register_count -
BackgroundWorkerData->parallel_terminate_count) >=
max_parallel_workers)
{
....
}
Thus, a/v leader cannot launch any workers if max_parallel_workers is set to 0.
> > > If we write the log "%d parallel autovacuum workers have been
> > > released" in AutoVacuumReleaseParallelWorkres(), can we simplify both
> > > tests (4 and 5) further?
> > >
> >
> > It won't help the 4th test, because ReleaseParallelWorkers is called
> > due to both ERROR and shmem_exit, but we want to be sure that
> > workers are released in the try/catch block (i.e. before the shmem_exit).
>
> We already call AutoVacuumReleaseAllParallelWorker() in the PG_CATCH()
> block in do_autovacuum(). If we write the log in
> AutoVacuumReleaseParallelWorkers(), the tap test is able to check the
> log, no?
>
Not quite. Assume that we add "%d workers have been released" log to the
ReleaseAllParallelWorkers. Then we trigger an error for a/v leader and wait
for this log (we are expecting that workers will be released inside the
try/catch block).
Even if there is a bug in the code and a/v leader cannot release parallel
workers due to occured error, one day it will finish vacuuming and call
"proc_exit". During "proc_exit" the "before_shmem_exit_hook" along with
the "ReleaseAllParallelWorkers" will be called.
I.e. we will see the desired log, and we will mistakenly consider this test
passed.
> > Also, I don't know whether the 5th test needs this log at all, because in
> > the end we are checking the number of free parallel workers. If a killed
> > a/v leader doesn't release parallel workers, we'll notice it.
>
> If we can check the log written at process shutdown time, I think we
> can somewhat simplify the test 5 logic by not attaching
> 'autovacuum-start-parallel-vacuum' injection point.
>
> 1. attach 'autovacuum-leader-before-indexes-processing' injection point.
> 2. wait for an av worker to stop at the injection point.
> 3. terminate the av worker.
> 4. verify from the log if the workers have been released.
> 5. disable parallel autovacuum.
> 6. check the free workers (should be 10).
>
> Step 5 and 6 seems to be optional though.
OK, I see your point. But I'm afraid that the "%d released" log can't help
us here for the reason I described above :
"%d released" can be called from several places and we cannot be sure
which one has emitted this log.
I suppose to do the same as we did for try/catch block - add logging inside
the "autovacuum_worker_before_shmem_exit" with some unique message.
Thus, we will be sure that the workers are released precisely in the
"before_shmem_exit_hook".
The alternative is to pass some additional information to the
"ReleaseAllParallelWorkers" function (to supplement the log it emits), but it
doesn't seem like a good solution to me.
**Comments on the 0001 patch**
> + /*
> + * Max number of parallel autovacuum workers. If value is 0 then parallel
> + * degree will computed based on number of indexes.
> + */
> + int autovacuum_parallel_workers;
>
> I'm a bit concerned that the above description doesn't explain what
> number of parallel vacuum workers are used in >0 as it mentioned only
> the maximum number. How about rewording it to:
>
> Target number of parallel autovacuum workers. -1 by default disables
> parallel vacuum during autovacuum. 0 means choose the parallel degree
> based on the number of indexes.
>
I agree.
**Comments on the 0002 patch**
> + PVWorkersUsage workers_usage;
> /* Counters that follow are only for scanned_pages */
> int64 tuples_deleted; /* # deleted from table */
>
> Let's insert a new line between the new line and the existing line.
>
OK
> + if (AmAutoVacuumWorkerProcess())
> + {
> + /* Worker usage stats for parallel autovacuum. */
> + appendStringInfo(&buf,
> + _("parallel workers: index
> vacuum: %d planned, %d reserved, %d launched in total\n"),
> + vacrel->workers_usage.vacuum.nplanned,
> + vacrel->workers_usage.vacuum.nreserved,
> + vacrel->workers_usage.vacuum.nlaunched);
> + }
> + else
> + {
> + /* Worker usage stats for manual VACUUM (PARALLEL). */
> + appendStringInfo(&buf,
> + _("parallel workers: index
> vacuum: %d planned, %d launched in total\n"),
> + vacrel->workers_usage.vacuum.nplanned,
> + vacrel->workers_usage.vacuum.nlaunched);
> + }
> + }
>
> These comments are very obvious so I don't think we need them.
I agree.
> Instead, I think it would be good to explain why we don't need to
> report "reserved" numbers in the manual vacuum cases.
>
I think that we can clarify somewhere why the "reserved" statistic
is collected only for autovacuum. PVWorkersStats is an appropriate
place for it. Thus, there will be no need to write something during
constructing the log.
> ---
> + if (vacrel->workers_usage.vacuum.nplanned > 0)
> + {
> + /* Stats for vacuum phase of index vacuuming. */
>
> and
>
> + if (vacrel->workers_usage.cleanup.nplanned > 0)
> + {
> + /* Stats for cleanup phase of index vacuuming. */
> +
>
> I don't think we need these comments (the second one has a typo
> though) as it's obvious.
>
I agree.
> */
> void
> parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long
> num_table_tuples,
> - int num_index_scans)
> + int num_index_scans, PVWorkersUsage *wusage)
>
> Please add a brief description of wusage to the function comment. We
> can add comments to both parallel_vacuum_bulkldel_all_indexes() and
> parallel_vacuum_cleanup_all_indexes() or only
> parallel_vacuum_process_all_indexes().
>
OK. I think that adding a comment only to the
parallel_vacuum_process_all_indexes will be more appropriate.
(I'm not sure if the comment I came up with looks good, but I couldn't
formulate it better).
>
> PVWorkersUsage is added twice
>
Oops
**Comments on the 0003 patch**
>
> +#define VacCostParamsEquals(params) \
> + (vacuum_cost_delay == (params).cost_delay && \
> + vacuum_cost_limit == (params).cost_limit && \
> + VacuumCostPageDirty == (params).cost_page_dirty && \
> + VacuumCostPageHit == (params).cost_page_hit && \
> + VacuumCostPageMiss == (params).cost_page_miss)
>
> I'm not sure this macro helps reduce lines of code or improve
> readability as it's used only once and it's slightly unnatural to me
> that *Equals macro takes only one argument.
>
I agree, it looks a bit odd. I'll remove it.
Moreover, this shmem state can be updated only by the a/v leader worker,
so I'll allow it to read shared variables without holding a spinlock.
It seems pretty reliable, what do you think?
**Comments on the 0004 patch**
> +#include "commands/vacuum.h"
> +#include "fmgr.h"
> +#include "miscadmin.h"
> +#include "postmaster/autovacuum.h"
> +#include "storage/shmem.h"
> +#include "storage/ipc.h"
> +#include "storage/lwlock.h"
> +#include "utils/builtins.h"
> +#include "utils/injection_point.h"
>
> We can remove some unnecessary header includes. ISTM we need only
> fmgr.h, autovacuum.h, and injection_point.h.
>
Agree, I'll remove unused includes.
> + const char *msg_format =
> + _("Parallel autovacuum worker cost params: cost_limit=%d,
> cost_delay=%g, cost_page_miss=%d, cost_page_dirty=%d,
> cost_page_hit=%d");
> +
>
> I don't think we need the translation for this message as it's not a
> user-facing one.
>
> We don't capitalize the first letter in the error message.
>
I agree.
> ---
> + ereport(DEBUG2,
> + (errmsg("number of free parallel autovacuum workers is set
> to %u due to config reload",
> + AutoVacuumShmem->av_freeParallelWorkers),
> + errhidecontext(true)));
>
> Why do we need to add errhidecontext(true) here?
>
I thought we don't need to write redundant info to the logfile. But I
don't see that other DEBUG2 messages are hiding context, so
I'll remove it.
BTW, do we want to use "elog" here too?
> ---
> + 'tests': [
> + 't/001_basic.pl',
> + ],
>
> Need to be updated to the new filename.
>
> ---
> + * Copyright (c) 2020-2025, PostgreSQL Global Development Group
>
> Please update the copyright years.
>
Yeah, I forgot about it. Will fix it.
Thank you very much for the review!
Please, see the updated set of patches.
--
Best regards,
Daniil Davydov
| Attachment | Content-Type | Size |
|---|---|---|
| v23-0003-Cost-based-parameters-propagation-for-parallel-a.patch | text/x-patch | 9.8 KB |
| v23-0002-Logging-for-parallel-autovacuum.patch | text/x-patch | 10.2 KB |
| v23-0001-Parallel-autovacuum.patch | text/x-patch | 19.4 KB |
| v23-0005-Documentation-for-parallel-autovacuum.patch | text/x-patch | 4.4 KB |
| v23-0004-Tests-for-parallel-autovacuum.patch | text/x-patch | 21.3 KB |
| v22--v23-diff-for-0004.patch | text/x-patch | 7.4 KB |
| v22--v23-diff-for-0003.patch | text/x-patch | 2.3 KB |
| v22--v23-diff-for-0001.patch | text/x-patch | 890 bytes |
| v22--v23-diff-for-0002.patch | text/x-patch | 4.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Marcos Pegoraro | 2026-03-01 14:54:27 | Re: Partial Mode in Aggregate Functions |
| Previous Message | David G. Johnston | 2026-03-01 14:29:56 | Re: Partial Mode in Aggregate Functions |