From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-12-18 10:06:05
Message-ID: CAA4eK1Jbc_jx725=h+W5-+ToirCBP2hpWG9fAsRMDqG+E9ORcA@mail.gmail.com
Lists: pgsql-hackers
On Wed, Dec 18, 2019 at 12:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Dec 18, 2019 at 11:46 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > On Wed, 18 Dec 2019 at 15:03, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > I was analyzing your changes related to ReinitializeParallelDSM() and
> > > it seems like we might launch more workers for the bulkdelete phase.
> > > While creating a parallel context, we used the maximum of "workers
> > > required for the bulkdelete phase" and "workers required for cleanup",
> > > but now if the number of workers required for the bulkdelete phase is
> > > smaller than for the cleanup phase (as mentioned by you in one
> > > example), then we would launch more workers than needed for the
> > > bulkdelete phase.
> >
> > Good catch. Currently, when creating a parallel context, the number of
> > workers passed to CreateParallelContext() is assigned not only to
> > pcxt->nworkers but also to pcxt->nworkers_to_launch. We would need to
> > specify the number of workers to actually launch after creating the
> > parallel context, or when creating it. Alternatively, we could call
> > ReinitializeParallelDSM() even the first time we run index vacuum.
> >
>
> How about just having a ReinitializeParallelWorkers() which, for now,
> can be called only from vacuum, even for the first time before the
> workers are launched?
>
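The idea above can be sketched as follows. This is a simplified illustration with hypothetical stub types, not the actual ParallelContext API (the real launch logic lives in the parallel infrastructure): the context is sized for the maximum workers any phase may need, and a ReinitializeParallelWorkers()-style call adjusts only the number of workers to launch for the current phase.

```c
#include <assert.h>

/* Simplified stand-in for PostgreSQL's ParallelContext (hypothetical). */
typedef struct ParallelContext
{
    int nworkers;           /* workers the context was sized for */
    int nworkers_to_launch; /* workers to actually start */
} ParallelContext;

/* Create the context sized for the maximum workers any phase may need. */
static void
CreateParallelContextStub(ParallelContext *pcxt, int nworkers)
{
    pcxt->nworkers = nworkers;
    pcxt->nworkers_to_launch = nworkers;
}

/*
 * Hypothetical ReinitializeParallelWorkers(): adjust only the number of
 * workers to launch, without changing how the context was sized.  Vacuum
 * could call this before every phase, including the first.
 */
static void
ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
{
    assert(nworkers_to_launch <= pcxt->nworkers);
    pcxt->nworkers_to_launch = nworkers_to_launch;
}
```

With this shape, a bulkdelete phase needing fewer workers than cleanup launches only what it needs, while the context stays sized for the larger phase.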
See the attached for what I have in mind. A few other comments:
1.
+ shared->disable_delay = (params->options & VACOPT_FAST);
This should be part of the third patch.
2.
+lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
+ LVRelStats *vacrelstats, LVParallelState *lps,
+ int nindexes)
{
..
..
+ /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
+ nworkers = Min(nworkers, lps->pcxt->nworkers);
..
}
This should be an Assert. In no case can the computed workers be more
than what we have in the context.
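In other words, the invariant can be asserted rather than clamped. A minimal sketch of the point, with the worker computation simplified (not the actual lazy vacuum code):

```c
#include <assert.h>

/*
 * Sketch: the worker count computed for a phase is derived from the same
 * per-index counts used to size the parallel context, so it can never
 * exceed the context's nworkers.  An assert documents that invariant
 * instead of silently clamping with Min().
 */
static int
compute_phase_workers(int nindexes_in_phase, int context_nworkers)
{
    int nworkers = nindexes_in_phase;   /* simplified computation */

    assert(nworkers <= context_nworkers);
    return nworkers;
}
```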
3.
+ if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
+ ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0))
+ nindexes_parallel_cleanup++;
I think the second condition should be VACUUM_OPTION_PARALLEL_COND_CLEANUP.
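For reference, a self-contained sketch of the corrected test; the flag values here are made up for illustration (the real definitions are in the patch):

```c
#include <stdbool.h>

/* Illustrative flag values; the real definitions live in the patch. */
#define VACUUM_OPTION_PARALLEL_CLEANUP      (1 << 0)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP (1 << 1)

/* An index counts toward parallel cleanup if either flag is set. */
static bool
supports_parallel_cleanup(unsigned int vacoptions)
{
    return ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
           ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0);
}
```

As written in the quoted hunk, the duplicated condition makes the second test a no-op, so conditionally-cleanable indexes would never be counted.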
I have fixed the above comments and some given by me earlier [1] in
the attached patch. The attached patch is a diff on top of
v36-0002-Add-parallel-option-to-VACUUM-command.
A few other comments which I have not fixed:
4.
+ if (Irel[i]->rd_indam->amusemaintenanceworkmem)
+ nindexes_mwm++;
+
+ /* Skip indexes that don't participate parallel index vacuum */
+ if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
+ RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
+ continue;
Won't we need to count the number of indexes that use
maintenance_work_mem only for the indexes that can participate in a
parallel vacuum? If so, the order of the above checks needs to be reversed.
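That is, the skip test would come first, so only participating indexes are counted. A simplified sketch with stub fields (hypothetical, not the actual index-relation structures):

```c
#include <stdbool.h>

/* Simplified per-index info (hypothetical, for illustration only). */
typedef struct IndexInfoStub
{
    bool uses_maintenance_work_mem; /* amusemaintenanceworkmem */
    bool can_parallel_vacuum;       /* passes the size/option checks */
} IndexInfoStub;

/* Count maintenance_work_mem users among participating indexes only. */
static int
count_mwm_indexes(const IndexInfoStub *indexes, int nindexes)
{
    int nindexes_mwm = 0;

    for (int i = 0; i < nindexes; i++)
    {
        /* Skip indexes that don't participate in parallel vacuum first. */
        if (!indexes[i].can_parallel_vacuum)
            continue;

        if (indexes[i].uses_maintenance_work_mem)
            nindexes_mwm++;
    }
    return nindexes_mwm;
}
```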
5.
/*
+ * Remember indexes that can participate parallel index vacuum and use
+ * it for index statistics initialization on DSM because the index
+ * size can get bigger during vacuum.
+ */
+ can_parallel_vacuum[i] = true;
I am not able to understand the second part of the comment ("because
the index size can get bigger during vacuum."). What is its
relevance?
6.
+/*
+ * Vacuum or cleanup indexes that can be processed by only the leader process
+ * because these indexes don't support parallel operation at that phase.
+ * Therefore this function must be called by the leader process.
+ */
+static void
+vacuum_indexes_leader(Relation *Irel, int nindexes,
IndexBulkDeleteResult **stats,
+ LVRelStats *vacrelstats, LVParallelState *lps)
{
..
Why have you changed the order of the nindexes parameter? In the
previous patch, it was the last parameter, and that seems to be a better
place for it. Also, I think after the latest modifications, you can
remove the second sentence in the above comment ("Therefore this
function must be called by the leader process.").
7.
+ for (i = 0; i < nindexes; i++)
+ {
+ bool leader_only = (get_indstats(lps->lvshared, i) == NULL ||
+ skip_parallel_vacuum_index(Irel[i], lps->lvshared));
+
+ /* Skip the indexes that can be processed by parallel workers */
+ if (!leader_only)
+ continue;
It would be better to name this variable skip_index or something like that.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment: v36-0002-Add-parallel-option-to-VACUUM-command.diff.amit.patch (application/octet-stream, 9.0 KB)