Re: [HACKERS] Block level parallel vacuum

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-13 04:21:26
Message-ID: CAFiTN-uUginUE5myOcx0J=yvjEMObpTRZeqBj4xBxVe_DgOJpA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 13, 2019 at 9:12 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
> > > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > > >
> > > > On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > > I realized that v31-0006 patch doesn't work fine so I've attached the
> > > > > > updated version patch that also incorporated some comments I got so
> > > > > > far. Sorry for the inconvenience. I'll apply your 0001 patch and also
> > > > > > test the total delay time.
> > > > > >
> > > > > While reviewing the 0002 patch, I got one doubt about how we are
> > > > > dividing maintenance_work_mem:
> > > > >
> > > > > +prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
> > > > > +{
> > > > > + /* Compute the new maintenance_work_mem value for index vacuuming */
> > > > > + lvshared->maintenance_work_mem_worker =
> > > > > + (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
> > > > > + maintenance_work_mem;
> > > > > +}
> > > > > Is it fair to just consider the number of indexes that use
> > > > > maintenance_work_mem? Or do we need to consider the number of workers
> > > > > as well? My point is: suppose there are 10 indexes that will use
> > > > > maintenance_work_mem but we are launching just 2 workers; then what is
> > > > > the point of dividing maintenance_work_mem by 10?
> > > > >
> > > > > IMHO the calculation should be like this
> > > > > lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
> > > > > maintenance_work_mem / Min(nindexes_mwm, nworkers) :
> > > > > maintenance_work_mem;
> > > > >
> > > > > Am I missing something?
> > > >
> > > > No, I think you're right. On the other hand, I think that dividing it
> > > > by the number of indexes that will use maintenance_work_mem makes
> > > > sense when the parallel degree is greater than the number of such
> > > > indexes. Suppose the table has 2 indexes and there are 10 workers;
> > > > then we should divide maintenance_work_mem by 2 rather than 10,
> > > > because at most 2 indexes that use maintenance_work_mem can be
> > > > processed in parallel at a time.
> > > >
> > > Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).
> >
> > Thanks! I'll fix it in the next version of the patch.
> >
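To make the difference concrete, here is a rough sketch of the combined calculation I have in mind (untested; nworkers here stands for the number of parallel workers we actually plan to launch):

	/*
	 * At most Min(nindexes_mwm, nworkers) indexes that use
	 * maintenance_work_mem can be vacuumed at the same time, so divide
	 * by that rather than by nindexes_mwm alone.
	 */
	lvshared->maintenance_work_mem_worker =
		(nindexes_mwm > 0) ?
		maintenance_work_mem / Min(nindexes_mwm, nworkers) :
		maintenance_work_mem;

With 10 such indexes but only 2 workers launched, this gives each worker maintenance_work_mem / 2 instead of maintenance_work_mem / 10; and with 2 such indexes and 10 workers it still divides by 2, as discussed above.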
> One more comment.
>
> +lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
> + int nindexes, IndexBulkDeleteResult **stats,
> + LVParallelState *lps)
> +{
> + ....
>
> + if (ParallelVacuumIsActive(lps))
> + {
>
> +
> + lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
> + stats, lps);
> +
> + }
> +
> + for (idx = 0; idx < nindexes; idx++)
> + {
> + /*
> + * Skip indexes that we have already vacuumed during parallel index
> + * vacuuming.
> + */
> + if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
> + continue;
> +
> + lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
> + vacrelstats->old_live_tuples);
> + }
> +}
>
> In this function, if ParallelVacuumIsActive, we perform the parallel
> vacuum of all the indexes for which parallel vacuum is supported, and
> once that is over we finish vacuuming the remaining indexes for which
> parallel vacuum is not supported. But my question is: inside
> lazy_parallel_vacuum_or_cleanup_indexes we wait for all the workers to
> finish their job, and only then do we start the sequential vacuuming.
> Shouldn't we start it immediately, as soon as the leader's
> participation in the parallel vacuum is over?
>
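To illustrate what I mean, a rough sketch (just for illustration; it assumes the wait for workers is hoisted out of lazy_parallel_vacuum_or_cleanup_indexes into a hypothetical lazy_parallel_wait_for_workers()):

	if (ParallelVacuumIsActive(lps))
	{
		/* Leader participates in the parallel vacuum of supported indexes */
		lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
												stats, lps);
	}

	/*
	 * As soon as the leader's participation is over, vacuum the indexes
	 * that don't support parallel vacuum instead of sitting idle while
	 * the workers finish.
	 */
	for (idx = 0; idx < nindexes; idx++)
	{
		if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
			continue;

		lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
						  vacrelstats->old_live_tuples);
	}

	/* Only now wait for the parallel workers to finish */
	if (ParallelVacuumIsActive(lps))
		lazy_parallel_wait_for_workers(lps);

That way the leader's serial work overlaps with the workers' parallel work.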

+ /*
+ * Since parallel workers cannot access data in temporary tables, parallel
+ * vacuum is not allowed for temporary relations.
+ */
+ if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+ {
+ ereport(WARNING,
+ (errmsg("skipping vacuum on \"%s\" --- cannot vacuum temporary
tables in parallel",
+ RelationGetRelationName(onerel))));
+ relation_close(onerel, lmode);
+ PopActiveSnapshot();
+ CommitTransactionCommand();
+ /* It's OK to proceed with ANALYZE on this table */
+ return true;
+ }
+

If we cannot support parallel vacuum for temporary tables, then
shouldn't we fall back to the normal vacuum instead of skipping the
table? I think it's not fair that if the user has requested a
system-wide parallel vacuum, all the temporary tables will be skipped
and not vacuumed at all, and the user will then need to perform a
normal vacuum on those tables again.
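
For example, instead of skipping the relation we could just disable the parallel option and continue with the normal (serial) vacuum, something like this (rough sketch, assuming that setting params->nworkers to -1 here means "no parallel vacuum"):

	/*
	 * Parallel workers cannot access data in temporary tables, so fall
	 * back to a serial vacuum instead of skipping the relation entirely.
	 */
	if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
	{
		ereport(WARNING,
				(errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
						RelationGetRelationName(onerel))));
		params->nworkers = -1;	/* proceed with a non-parallel vacuum */
	}

That way the temporary table still gets vacuumed in the same run, just without parallel workers.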

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
