Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-20 05:30:25
Message-ID: CA+fd4k7dj4m-BSKKutE+HQApdMXeBFsAo53xQ4vRrevcY5wDwA@mail.gmail.com
Lists: pgsql-hackers

On Mon, 18 Nov 2019 at 15:38, Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Mon, 18 Nov 2019 at 15:34, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 18, 2019 at 11:37 AM Masahiko Sawada
> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > >
> > > On Wed, 13 Nov 2019 at 14:31, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > Based on these needs, we came up with a way to allow users to specify
> > > > this information for IndexAm's. Basically, Indexam will expose a
> > > > variable amparallelvacuumoptions which can have below options
> > > >
> > > > VACUUM_OPTION_NO_PARALLEL 1 << 0 # vacuum (neither bulkdelete nor
> > > > vacuumcleanup) can't be performed in parallel
> > >
> > > I think VACUUM_OPTION_NO_PARALLEL can be 0 so that index AMs who don't
> > > want to support parallel vacuum don't have to set anything.
> > >
> >
> > Makes sense.
> >
> > > > VACUUM_OPTION_PARALLEL_BULKDEL 1 << 1 # bulkdelete can be done in
> > > > parallel (Indexes nbtree, hash, gin, gist, spgist, bloom will set this
> > > > flag)
> > > > VACUUM_OPTION_PARALLEL_COND_CLEANUP 1 << 2 # vacuumcleanup can be
> > > > done in parallel if bulkdelete is not performed (Indexes nbtree, brin,
> > > > gin, gist, spgist, bloom will set this flag)
> > > > VACUUM_OPTION_PARALLEL_CLEANUP 1 << 3 # vacuumcleanup can be done in
> > > > parallel even if bulkdelete is already performed (Indexes gin, brin,
> > > > and bloom will set this flag)
> > >
> > > I think gin and bloom don't need to set both but should set only
> > > VACUUM_OPTION_PARALLEL_CLEANUP.
> > >
> > > And I'm going to disallow index AMs to set both
> > > VACUUM_OPTION_PARALLEL_COND_CLEANUP and VACUUM_OPTION_PARALLEL_CLEANUP
> > > by assertions, is that okay?
> > >
> >
> > Sounds reasonable to me.
> >
> > Are you planning to include the changes related to I/O throttling
> > based on the discussion in the nearby thread [1]? I think you can do
> > that if you agree with the conclusion in the last email[1], otherwise,
> > we can explore it separately.
>
> Yes, I agree. I'm going to include those changes in the next version
> of the patches. I think we will be able to discuss further based on
> the patch.
>

I've attached the latest version of the patch set. It includes all the
discussed points regarding index AM options as well as the shared cost
balance. I also added some test cases that use all types of index AM.
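
To make the index AM options concrete, here is a rough sketch of the
flags and the check discussed in the quoted text above. It is only an
illustration, not the code in the attached patches: the bit values for
the three parallel flags are copied from the quoted proposal (with
VACUUM_OPTION_NO_PARALLEL changed to 0), and the helper names
vacuum_options_valid() and can_parallel_cleanup() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Flag values as in the quoted proposal, except NO_PARALLEL is 0 so that
 * an index AM that sets nothing gets no parallel vacuum. The exact bit
 * positions in the patch may differ; these are an assumption.
 */
#define VACUUM_OPTION_NO_PARALLEL            0
#define VACUUM_OPTION_PARALLEL_BULKDEL       (1 << 1)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP  (1 << 2)
#define VACUUM_OPTION_PARALLEL_CLEANUP       (1 << 3)

/*
 * Hypothetical check: an index AM must not set both COND_CLEANUP and
 * CLEANUP. The patch would enforce this with an assertion.
 */
static bool
vacuum_options_valid(uint8_t opts)
{
    bool cond = (opts & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0;
    bool always = (opts & VACUUM_OPTION_PARALLEL_CLEANUP) != 0;

    return !(cond && always);
}

/*
 * Hypothetical helper: can vacuumcleanup for this index run in parallel,
 * given whether bulkdelete has already been performed?
 */
static bool
can_parallel_cleanup(uint8_t opts, bool bulkdel_done)
{
    assert(vacuum_options_valid(opts));

    if (opts & VACUUM_OPTION_PARALLEL_CLEANUP)
        return true;            /* parallel regardless of bulkdelete */
    if (opts & VACUUM_OPTION_PARALLEL_COND_CLEANUP)
        return !bulkdel_done;   /* parallel only if bulkdelete was skipped */
    return false;               /* NO_PARALLEL or bulkdelete-only */
}
```

With these definitions, an AM that sets only
VACUUM_OPTION_PARALLEL_COND_CLEANUP gets parallel cleanup only when
bulkdelete was not performed, while one that sets
VACUUM_OPTION_PARALLEL_CLEANUP gets it in both cases.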

During development I had one concern about the number of parallel
workers to launch. In the current design each index AM can choose
whether to participate in parallel bulk-deletion and in parallel
cleanup. That also means the number of parallel workers to launch might
differ between parallel bulk-deletion and parallel cleanup. In the
current patch the leader always launches as many workers as there are
indexes that support either one, but that would not be efficient in
some cases. For example, if we have 3 indexes supporting only parallel
bulk-deletion and 2 indexes supporting only parallel index cleanup, we
would launch 5 workers for each execution but some workers would do
nothing at all. To deal with this problem, I wonder if we can improve
the parallel query infrastructure so that the leader process creates a
parallel context sized for the maximum number of indexes and can launch
only a part of the workers instead of all of them.
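
The arithmetic of the example above can be sketched as follows. This
is only an illustration of the counting, not code from the patches:
count_supporting(), VACUUM_OPTION_PARALLEL_ANY and example_opts are
hypothetical names, the flag values are assumed from the earlier
proposal, and "parallel index cleanup" is simplified to the
VACUUM_OPTION_PARALLEL_CLEANUP flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values assumed from the earlier proposal in this thread. */
#define VACUUM_OPTION_PARALLEL_BULKDEL       (1 << 1)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP  (1 << 2)
#define VACUUM_OPTION_PARALLEL_CLEANUP       (1 << 3)

/* Hypothetical mask: the index supports any kind of parallel vacuum. */
#define VACUUM_OPTION_PARALLEL_ANY \
    (VACUUM_OPTION_PARALLEL_BULKDEL | \
     VACUUM_OPTION_PARALLEL_COND_CLEANUP | \
     VACUUM_OPTION_PARALLEL_CLEANUP)

/* Count the indexes whose options include at least one bit of 'mask'. */
static int
count_supporting(const uint8_t *opts, size_t nindexes, uint8_t mask)
{
    int n = 0;

    assert(opts != NULL);
    for (size_t i = 0; i < nindexes; i++)
    {
        if (opts[i] & mask)
            n++;
    }
    return n;
}

/* The example: 3 bulk-deletion-only indexes and 2 cleanup-only indexes. */
static const uint8_t example_opts[5] = {
    VACUUM_OPTION_PARALLEL_BULKDEL,
    VACUUM_OPTION_PARALLEL_BULKDEL,
    VACUUM_OPTION_PARALLEL_BULKDEL,
    VACUUM_OPTION_PARALLEL_CLEANUP,
    VACUUM_OPTION_PARALLEL_CLEANUP,
};
```

Counting with VACUUM_OPTION_PARALLEL_ANY gives 5, which is what the
leader launches for every execution today. Counting per phase gives 3
for bulk-deletion and 2 for cleanup, so 2 workers are idle during
bulk-deletion and 3 during cleanup. Launching only the per-phase count
from a parallel context sized for 5 is the improvement I have in mind.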

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v33-0001-Add-index-AM-field-and-callback-for-parallel-ind.patch application/octet-stream 14.9 KB
v33-0003-Add-paralell-P-option-to-vacuumdb-command.patch application/octet-stream 5.9 KB
v33-0002-Add-parallel-option-to-VACUUM-command.patch application/octet-stream 69.9 KB
