Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2020-01-10 05:13:20
Message-ID: CA+fd4k4Gi1yrrSbfb_8gbOLYeMOi7ZaKv1n0c9aUFn8nJo4Wng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 9 Jan 2020 at 19:33, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jan 9, 2020 at 10:41 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > On Wed, 8 Jan 2020 at 22:16, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > >
> > > What do you think of the attached? Sawada-san, kindly verify the
> > > changes and let me know your opinion.
> >
> > I agreed to not include both the FAST option patch and
> > DISABLE_LEADER_PARTICIPATION patch at this stage. It's better to focus
> > on the main part and we can discuss and add them later if want.
> >
> > I've looked at the latest version patch you shared. Overall it looks
> > good and works fine. I have a few small comments:
> >
>
> I have addressed all your comments and slightly change nearby comments
> and ran pgindent. I think we can commit the first two preparatory
> patches now unless you or someone else has any more comments on those.

Yes.

I'd like to briefly summarize the
v43-0002-Allow-vacuum-command-to-process-indexes-in-parallel for other
reviewers who wants to newly starts to review this patch:

Introduce PARALLEL option to VACUUM command. Parallel vacuum is
enabled by default. The number of parallel workers is determined based
on the number of indexes that support parallel index when user didn't
specify the parallel degree or PARALLEL option is omitted. Specifying
PARALLEL 0 disables parallel vacuum.

In parallel vacuum of this patch, only the leader process does heap
scan and collect dead tuple TIDs on the DSM segment. Before starting
index vacuum or index cleanup the leader launches the parallel workers
and perform it together with parallel workers. Individual index are
processed by one vacuum worker process. Therefore parallel vacuum can
be used when the table has at least 2 indexes (the leader always takes
one index). After launched parallel workers, the leader process
vacuums indexes first that don't support parallel index after launched
parallel workers. The parallel workers process indexes that support
parallel index vacuum and the leader process join as a worker after
completing such indexes. Once all indexes are processed the parallel
worker processes exit. After that, the leader process re-initializes
the parallel context so that it can use the same DSM for multiple
passes of index vacuum and for performing index cleanup. For updating
the index statistics, we need to update the system table and since
updates are not allowed during parallel mode we update the index
statistics after exiting from the parallel mode.

When the vacuum cost-based delay is enabled, even parallel vacuum is
throttled. The basic idea of a cost-based vacuum delay for parallel
index vacuuming is to allow all parallel vacuum workers including the
leader process to have a shared view of cost related parameters
(mainly VacuumCostBalance). We allow each worker to update it as and
when it has incurred any cost and then based on that decide whether it
needs to sleep. We allow the worker to sleep proportional to the work
done and reduce the VacuumSharedCostBalance by the amount which is
consumed by the current worker (VacuumCostBalanceLocal). This can
avoid letting the workers sleep who have done less or no I/O as
compared to other workers and therefore can ensure that workers who
are doing more I/O got throttled more.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-01-10 05:16:37 Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Previous Message Mahendra Singh Thalor 2020-01-10 05:08:50 Re: [HACKERS] Block level parallel vacuum