Re: [HACKERS] Block level parallel vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-10-06 10:59:27
Message-ID: CAA4eK1K+2qucdnyAk-eZ7zOezsyhNz8B6K0bOV_Ah9TouOi8-A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 4, 2019 at 7:34 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> On Fri, Oct 4, 2019 at 2:02 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >>
> >> I'd also prefer to use at most maintenance_work_mem during parallel
> >> vacuum, regardless of the number of parallel workers. This is the
> >> current implementation. In lazy vacuum, maintenance_work_mem is used to
> >> record the item pointers of dead tuples. This is done by the leader
> >> process, and the worker processes just refer to them when vacuuming
> >> dead index tuples. Even if the user sets a small maintenance_work_mem,
> >> parallel vacuum would be helpful, as index vacuuming would still take
> >> time. So I thought we should cap the number of parallel workers
> >> by the number of indexes rather than by maintenance_work_mem.
> >>
> >
> > Isn't that true only if we never use maintenance_work_mem during index
> > cleanup? However, I think we do use it during index cleanup; see, for
> > example, ginInsertCleanup. I think before reaching any conclusion about
> > what to do about this, we first need to establish whether this is a
> > problem. If I am correct, then only some of the index cleanups (like
> > gin index) use maintenance_work_mem, so we need to consider that point
> > while designing a solution for this.
> >
>
> I got your point. Currently the single-process lazy vacuum could
> consume up to (maintenance_work_mem * 2) memory, because we do index
> cleanup while holding the dead tuple space, as you mentioned. And
> ginInsertCleanup is also called at the beginning of ginbulkdelete.
> In the current parallel lazy vacuum, each parallel vacuum worker could
> consume other memory apart from the memory used by the heap scan,
> depending on the implementation of the target index AM. Given the
> current single-process and parallel vacuum implementations, it would be
> better to control the total amount of memory rather than the number of
> parallel workers. So one approach I came up with is that we make all
> vacuum workers use (maintenance_work_mem / # of participants) as their
> new maintenance_work_mem.

Yeah, we can do something like that, but I am not clear whether the current
memory usage for GIN indexes is correct. I have started a new thread [1];
let's discuss it there.
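
For illustration only, the even split proposed in the quoted text
(maintenance_work_mem / # of participants) can be sketched as below. This
is not code from the patch: the function name, the kB unit, and the use of
integer division are all assumptions made for the example.

```python
def split_maintenance_work_mem(maintenance_work_mem_kb: int,
                               nparticipants: int) -> int:
    """Return a per-participant memory budget in kB.

    Hypothetical helper: divides the configured budget evenly among the
    leader and workers, rounding down so that the shares never add up
    to more than the configured total.
    """
    if maintenance_work_mem_kb <= 0:
        raise ValueError("maintenance_work_mem_kb must be positive")
    if nparticipants < 1:
        raise ValueError("nparticipants must be at least 1")
    return maintenance_work_mem_kb // nparticipants


if __name__ == "__main__":
    budget_kb = 65536  # 64MB, an arbitrary example value
    for n in (1, 2, 3, 4):
        share = split_maintenance_work_mem(budget_kb, n)
        # share * n stays within the configured budget for every n
        print(f"participants={n} share_kb={share} total_kb={share * n}")
```

With such a split the combined budget stays within maintenance_work_mem no
matter how many participants there are, at the cost of a smaller share for
each participant as the count grows.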

[1] -
https://www.postgresql.org/message-id/CAA4eK1LmcD5aPogzwim5Nn58Ki%2B74a6Edghx4Wd8hAskvHaq5A%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
