Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-13 01:22:56
Message-ID: CA+fd4k5WG+4BWKsEkKW=9WsoorJLeNVFoZ899TUDwsbinxRHtw@mail.gmail.com
Lists: pgsql-hackers

On Tue, 12 Nov 2019 at 22:33, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Nov 12, 2019 at 5:30 PM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > On Tue, 12 Nov 2019 at 20:11, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Nov 12, 2019 at 3:39 PM Masahiko Sawada
> > > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > > >
> > > > On Tue, 12 Nov 2019 at 18:26, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Tue, Nov 12, 2019 at 2:25 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > Yeah, maybe something like amparallelvacuumoptions. The options can be:
> > > > > >
> > > > > > VACUUM_OPTION_NO_PARALLEL 0 # vacuum (neither bulkdelete nor
> > > > > > vacuumcleanup) can't be performed in parallel
> > > > > > VACUUM_OPTION_NO_PARALLEL_CLEANUP 1 # vacuumcleanup cannot be
> > > > > > performed in parallel (hash index will set this flag)
> > > > >
> > > > > Maybe we don't want this option? because if 3 or 4 is not set then we
> > > > > will not do the cleanup in parallel right?
> > > > >
> > >
> > > Yeah, but it is better to be explicit about this.
> >
> > VACUUM_OPTION_NO_PARALLEL_BULKDEL is missing?
> >
>
> I am not sure if that is required.
>
> > I think brin indexes
> > will use this flag.
> >
>
> Brin index can set VACUUM_OPTION_PARALLEL_CLEANUP in my proposal and
> it should work.
>
> > It will end up with
> > (VACUUM_OPTION_NO_PARALLEL_CLEANUP |
> > VACUUM_OPTION_NO_PARALLEL_BULKDEL) is equivalent to
> > VACUUM_OPTION_NO_PARALLEL, though.
> >
> > >
> > > > > > VACUUM_OPTION_PARALLEL_BULKDEL 2 # bulkdelete can be done in
> > > > > > parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
> > > > > > flag)
> > > > > > VACUUM_OPTION_PARALLEL_COND_CLEANUP 3 # vacuumcleanup can be done in
> > > > > > parallel if bulkdelete is not performed (Indexes nbtree, brin, hash,
> > > > > > gin, gist, spgist, bloom will set this flag)
> > > > > > VACUUM_OPTION_PARALLEL_CLEANUP 4 # vacuumcleanup can be done in
> > > > > > parallel even if bulkdelete is already performed (Indexes gin, brin,
> > > > > > and bloom will set this flag)
> > > > > >
> > > > > > Does something like this make sense?
> > > >
> > > > 3 and 4 confused me because 4 also looks conditional. How about having
> > > > two flags instead: one for doing parallel cleanup when not performed
> > > > yet (VACUUM_OPTION_PARALLEL_COND_CLEANUP) and another one for doing
> > > > always parallel cleanup (VACUUM_OPTION_PARALLEL_CLEANUP)?
> > > >
> > >
> > > Hmm, this is exactly what I intend to say with 3 and 4. I am not sure
> > > what makes you think 4 is conditional.
> >
> > Hmm, so why would gin and bloom set both flags 3 and 4? I thought that
> > if an index sets 4 it doesn't need to set 3, because 4 means always
> > doing cleanup in parallel.
> >
>
> Yeah, that makes sense. They can just set 4.

Okay,

>
> > >
> > > > That way, we
> > > > can have flags as follows and index AM chooses two flags, one from the
> > > > first two flags for bulk deletion and another from next three flags
> > > > for cleanup.
> > > >
> > > > VACUUM_OPTION_PARALLEL_NO_BULKDEL 1 << 0
> > > > VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1
> > > > VACUUM_OPTION_PARALLEL_NO_CLEANUP 1 << 2
> > > > VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 3
> > > > VACUUM_OPTION_PARALLEL_CLEANUP 1 << 4
> > > >
> > >
> > > This also looks reasonable, but if there is an index that doesn't want
> > > to support a parallel vacuum, it needs to set multiple flags.
> >
> > Right. It would be better to treat a uint16 as two uint8s. I mean that if
> > the first 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_BULKDEL, and if
> > the next 8 bits are 0 it means VACUUM_OPTION_PARALLEL_NO_CLEANUP. The other
> > flags could be as follows:
> >
> > VACUUM_OPTION_PARALLEL_BULKDEL 0x0001
> > VACUUM_OPTION_PARALLEL_COND_CLEANUP 0x0100
> > VACUUM_OPTION_PARALLEL_CLEANUP 0x0200
> >
>
> Hmm, I think we should define these flags in the most simple way.
> Your previous proposal sounds okay to me.

Okay. As you mentioned before, my previous proposal won't work for
existing index AMs that don't set amparallelvacuumoptions. But since we
have amcanparallelvacuum, which is false by default, I think we don't
need to worry about a backward compatibility problem. An existing index
AM will use neither parallel bulk-deletion nor parallel cleanup by
default. When an index AM wants to support parallel vacuum, it will set
amparallelvacuumoptions as well as amcanparallelvacuum.

I'll try using my previous proposal and check it. If something goes
wrong, we can go back to your proposal or consider others.
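
To make the idea concrete, here is a rough sketch of how the five
flags could be combined and checked. It is only an illustration, not
code from the patch: the struct SketchIndexAm, the three sketch_*
helpers, and the uint8_t field type are hypothetical names and choices
made up for this example. Only the flag names and values,
amcanparallelvacuum, and amparallelvacuumoptions come from the
discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as listed in the proposal quoted above. */
#define VACUUM_OPTION_PARALLEL_NO_BULKDEL    (1 << 0)
#define VACUUM_OPTION_PARALLEL_BULKDEL       (1 << 1)
#define VACUUM_OPTION_PARALLEL_NO_CLEANUP    (1 << 2)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP  (1 << 3)
#define VACUUM_OPTION_PARALLEL_CLEANUP       (1 << 4)

/* Hypothetical stand-in for the two index AM fields under discussion. */
typedef struct SketchIndexAm
{
    bool    amcanparallelvacuum;     /* false by default */
    uint8_t amparallelvacuumoptions; /* assumed type; read only if the above is true */
} SketchIndexAm;

/* True if exactly one bit of x is set. */
static inline bool
sketch_one_bit(uint8_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/*
 * An AM picks one flag from the first two (bulk deletion) and one flag
 * from the next three (cleanup).
 */
static inline bool
sketch_options_valid(uint8_t opts)
{
    uint8_t bulkdel = opts & (VACUUM_OPTION_PARALLEL_NO_BULKDEL |
                              VACUUM_OPTION_PARALLEL_BULKDEL);
    uint8_t cleanup = opts & (VACUUM_OPTION_PARALLEL_NO_CLEANUP |
                              VACUUM_OPTION_PARALLEL_COND_CLEANUP |
                              VACUUM_OPTION_PARALLEL_CLEANUP);

    return sketch_one_bit(bulkdel) && sketch_one_bit(cleanup);
}

/* Can bulk deletion of this index be handed to a parallel worker? */
static inline bool
sketch_parallel_bulkdel(const SketchIndexAm *am)
{
    if (!am->amcanparallelvacuum)
        return false;           /* existing AMs: options are never inspected */
    assert(sketch_options_valid(am->amparallelvacuumoptions));
    return (am->amparallelvacuumoptions & VACUUM_OPTION_PARALLEL_BULKDEL) != 0;
}

/* Can cleanup of this index be handed to a parallel worker? */
static inline bool
sketch_parallel_cleanup(const SketchIndexAm *am, bool bulkdel_done)
{
    if (!am->amcanparallelvacuum)
        return false;
    assert(sketch_options_valid(am->amparallelvacuumoptions));
    if (am->amparallelvacuumoptions & VACUUM_OPTION_PARALLEL_CLEANUP)
        return true;            /* always, even after bulk deletion */
    if (am->amparallelvacuumoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP)
        return !bulkdel_done;   /* only if bulk deletion was not performed */
    return false;               /* VACUUM_OPTION_PARALLEL_NO_CLEANUP */
}
```

In this sketch a zero-initialized SketchIndexAm behaves like an existing
index AM: both helpers return false without looking at the options, so
the missing flags never matter. An AM that wants conditional cleanup
would set (VACUUM_OPTION_PARALLEL_BULKDEL |
VACUUM_OPTION_PARALLEL_COND_CLEANUP) together with amcanparallelvacuum.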

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
