Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-12-13 05:38:03
Message-ID: CA+fd4k4hbktv9uG11_PpM02Jo+sWd7AMVsExmEP8tjT0po2qUQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, 13 Dec 2019 at 14:19, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Dec 13, 2019 at 10:03 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > Sorry for the late reply.
> >
> > On Fri, 6 Dec 2019 at 14:20, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > >
> > > > > Here, we have a need to reduce the number of workers. Index vacuum
> > > > > has two different phases (index vacuum and index cleanup) which use
> > > > > the same parallel context/DSM, but each can have different worker
> > > > > requirements. The second phase (cleanup) would normally need fewer
> > > > > workers: if the work was done in the first phase, the second phase
> > > > > doesn't need a worker for that index. But there are exceptions like
> > > > > gin indexes, which need a worker for the second phase as well
> > > > > because cleanup makes another pass over the index even if it was
> > > > > already cleaned in the first phase. Now, consider the case where we
> > > > > have 3 btree indexes and 2 gin indexes: we would need 5 workers for
> > > > > the index vacuum phase and 2 workers for the index cleanup phase.
> > > > > There are other cases too.
> > > > >
> > > > > We also considered having a separate DSM for each phase, but that
> > > > > appeared to add overhead without much benefit.
> > > >
> > > > How about adding an additional argument to ReinitializeParallelDSM()
> > > > that allows the number of workers to be reduced? That seems like it
> > > > would be less confusing than what you have now, and would involve
> > > > modifying code in a lot fewer places.
> > > >
> > >
> > > Yeah, we can do that. We can maintain some information in
> > > LVParallelState which indicates whether we need to reinitialize the
> > > DSM before launching workers. Sawada-San, do you see any problem with
> > > this idea?
> >
> > I think the number of workers could be increased in the cleanup phase.
> > For example, if we have 1 brin index and 2 gin indexes, then in the
> > bulkdelete phase we need only 1 worker but in the cleanup phase we
> > need 2 workers.
> >
>
> I think it shouldn't be more than the number with which we have
> created a parallel context, no? If that is the case, then I think it
> should be fine.

Right. I thought that ReinitializeParallelDSM() with an additional
argument would shrink the DSM, but I understand now that it doesn't
actually reduce the DSM; it just takes the number of workers to
launch, is that right? And we would also need to call
ReinitializeParallelDSM() at the beginning of index vacuum or index
cleanup, since at the end of index vacuum we don't yet know whether we
will next do index vacuum or index cleanup.
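
To make this concrete, here is a rough sketch of what I have in mind
(an assumption on my side, not the actual patch): suppose
ReinitializeParallelDSM() took an extra argument capping the number of
workers to launch, and that LVParallelState keeps the ParallelContext
in a pcxt field. The leader would then reinitialize before launching
workers for each phase, something like:

/*
 * Hypothetical sketch only: assumes ReinitializeParallelDSM() takes an
 * extra nworkers_to_launch argument and that LVParallelState has a
 * ParallelContext *pcxt field; lazy_parallel_index_pass is a made-up
 * helper name.
 */
static void
lazy_parallel_index_pass(LVParallelState *lps, int nworkers_for_phase)
{
	/* Never exceed the count the parallel context was created with. */
	nworkers_for_phase = Min(nworkers_for_phase, lps->pcxt->nworkers);

	/* Reinitialize the DSM/launch count before starting this phase. */
	ReinitializeParallelDSM(lps->pcxt, nworkers_for_phase);
	LaunchParallelWorkers(lps->pcxt);

	/* ... the leader also processes indexes here ... */

	WaitForParallelWorkersToFinish(lps->pcxt);
}

With that, for the earlier example of 1 brin index and 2 gin indexes,
the bulkdelete pass would be called with nworkers_for_phase = 1 and
the cleanup pass with nworkers_for_phase = 2, but never more than the
number the parallel context was created with.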

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
