Re: [HACKERS] Block level parallel vacuum

From:	Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Block level parallel vacuum
Date:	2019-12-13 04:33:13
Message-ID:	CA+fd4k7+Ak1e9HD5NB9SwKU4228j8=-8fRE7mvu072d8Jjdhsg@mail.gmail.com
Lists:	pgsql-hackers

Sorry for the late reply.

On Fri, 6 Dec 2019 at 14:20, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > I think it might be a good idea to change what we expect index AMs to
> > do rather than trying to make anything that they happen to be doing
> > right now work, no matter how crazy. In particular, suppose we say
> > that you CAN'T add data on to the end of IndexBulkDeleteResult any
> > more, and that instead the extra data is passed through a separate
> > parameter. And then you add an estimate method that gives the size of
> > the space provided by that parameter (and if the estimate method isn't
> > defined then the extra parameter is passed as NULL) and document that
> > the data stored there might get flat-copied.
> >
>
> I think this is a good idea and serves the purpose we are trying to
> achieve currently. However, if there is any IndexAM that is using
> the current way to pass stats with additional information, they would
> need to change even if they don't want to use parallel vacuum
> functionality (say because their indexes are too small or whatever
> other reasons). I think this is a reasonable trade-off and the
> changes on their end won't be that big. So, we should do this.
>
> > Now, you've taken the
> > onus off of parallel vacuum to cope with any crazy thing a
> > hypothetical AM might be doing, and instead you've defined the
> > behavior of that hypothetical AM as wrong. If somebody really needs
> > that, it's now their job to modify the index AM machinery further
> > instead of your job to somehow cope.
> >
>
> Makes sense.
>
> > > Here, we have a need to reduce the number of workers. Index Vacuum
> > > > has two different phases (index vacuum and index cleanup) which use
> > > > the same parallel-context/DSM but both could have different
> > > > requirements for workers. The second phase (cleanup) would normally
> > > > need fewer workers because, if the work is done in the first phase,
> > > > the second wouldn't need them, but we have exceptions like gin
> > > > indexes where we need them for the second phase as well because it
> > > > makes a pass over the index again even if we have cleaned the index
> > > > in the first phase.
> > > Now, consider the case where we have 3 btree indexes and 2 gin
> > > indexes, we would need 5 workers for index vacuum phase and 2 workers
> > > for index cleanup phase. There are other cases too.
> > >
> > > > We also considered having a separate DSM for each phase, but that
> > > > appeared to have overhead without much benefit.
> >
> > How about adding an additional argument to ReinitializeParallelDSM()
> > that allows the number of workers to be reduced? That seems like it
> > > would be less confusing than what you have now, and would involve
> > > modifying code in a lot fewer places.
> >
>
> Yeah, we can do that. We can maintain some information in
> LVParallelState which indicates whether we need to reinitialize the
> DSM before launching workers. Sawada-San, do you see any problem with
> this idea?

I think the number of workers could also increase in the cleanup phase,
not only decrease. For example, if we have 1 brin index and 2 gin
indexes, then in the bulkdelete phase we need only 1 worker, but in the
cleanup phase we need 2 workers.
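
As a rough illustration of the counting above, here is a minimal C
sketch. The struct, the per-index flags and the function name are
hypothetical and not taken from the patch. The flag values used in the
examples, and the assumption that the leader can process one index
itself, are only one reading that reproduces the numbers mentioned in
this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-index description; not a structure from the patch. */
typedef struct IndexPhaseNeeds
{
    const char *amname;           /* access method name, for readability */
    bool        needs_bulkdelete; /* has real work in the bulkdelete phase */
    bool        needs_cleanup;    /* has real work in the cleanup phase */
} IndexPhaseNeeds;

typedef enum VacPhase
{
    PHASE_BULKDELETE,
    PHASE_CLEANUP
} VacPhase;

/*
 * Count the workers wanted for one phase: one per index that has work in
 * that phase, minus one if the leader is assumed to take an index itself.
 */
static int
workers_for_phase(const IndexPhaseNeeds *indexes, int nindexes,
                  VacPhase phase, bool leader_participates)
{
    int         busy = 0;

    assert(nindexes >= 0);

    for (int i = 0; i < nindexes; i++)
    {
        bool        needed = (phase == PHASE_BULKDELETE)
            ? indexes[i].needs_bulkdelete
            : indexes[i].needs_cleanup;

        if (needed)
            busy++;
    }

    /* The leader handles one of the busy indexes, if any. */
    if (leader_participates && busy > 0)
        busy--;

    return busy;
}
```

With 1 brin index marked as cleanup-only and 2 gin indexes marked for
both phases, and the leader participating, this gives 1 worker for
bulkdelete and 2 for cleanup, so the count goes up between the phases.
With 3 btree indexes marked as bulkdelete-only and 2 gin indexes, and
the leader not counted, it gives 5 and 2, as in Amit's example.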

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Re: [HACKERS] Block level parallel vacuum at 2019-12-06 05:20:33 from Amit Kapila

Responses

Re: [HACKERS] Block level parallel vacuum at 2019-12-13 05:18:49 from Amit Kapila