Re: [HACKERS] Block level parallel vacuum

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-12 09:26:33
Message-ID: CAFiTN-t3N0Y8D1yFMbViy6C-E60UB1QibPgHxwsvpDzD-bT1zA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Nov 12, 2019 at 7:43 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
> > > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > > >
> > > > After more thoughts, I think we can have a ternary value: never,
> > > > always, once. If it's 'never' the index never participates in parallel
> > > > cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
> > > > index always participates regardless of vacrelstats->num_index_scan. I
> > > > guess gin, brin and bloom use 'always'. Finally if it's 'once' the
> > > > index participates in parallel cleanup only when it's the first time
> > > > (that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
> > > > spgist use 'once'.
> > > >
> > >
> > > I think this 'once' option is confusing especially because it also
> > > depends on 'num_index_scans' which the IndexAM has no control over.
> > > It might be that the option name is not good, but I am not sure.
> > > Another thing is that for brin indexes, we don't want bulkdelete to
> > > participate in parallelism.
> >
> > I thought brin should set amcanparallelvacuum is false and
> > amcanparallelcleanup is 'always'.
> >
>
> In that case, it is better to name the variable as amcanparallelbulkdelete.
>
> > > Do we want to have separate variables for
> > > ambulkdelete and amvacuumcleanup which decides whether the particular
> > > phase can be done in parallel?
> >
> > You mean adding variables to ambulkdelete and amvacuumcleanup as
> > function arguments?
> >
>
> No, I mean separate variables amcanparallelbulkdelete (bool) and
> amcanparallelvacuumcleanup (unit16) variables.
>
> >
> > > Another possibility could be to just
> > > have one variable (say uint16 amparallelvacuum) which will tell us all
> > > the options but I don't think that will be a popular approach
> > > considering all the other methods and variables exposed. What do you
> > > think?
> >
> > Adding only one variable that can have flags would also be a good
> > idea, instead of having multiple variables for each option. For
> > instance FDW API uses such interface (see eflags of BeginForeignScan).
> >
>
> Yeah, maybe something like amparallelvacuumoptions. The options can be:
>
> VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
> vacuumcleanup) can't be performed in parallel
> VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
> performed in parallel (hash index will set this flag)

Maybe we don't want this option? because if 3 or 4 is not set then we
will not do the cleanup in parallel right?

> VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
> parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
> flag)
> VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
> parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
> gin, gist, spgist, bloom will set this flag)
> VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
> parallel even if bulkdelete is already performed (Indexes gin, brin,
> and bloom will set this flag)
>
> Does something like this make sense?
Yeah, something like that seems better to me.

> If we all agree on this, then I
> think we can summarize the part of the discussion related to this API
> and get feedback from a broader audience.

Make sense.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2019-11-12 09:33:26 Re: cost based vacuum (parallel)
Previous Message Julien Rouhaud 2019-11-12 09:07:28 Re: Monitoring disk space from within the server