Re: maintenance_work_mem used by Vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: maintenance_work_mem used by Vacuum
Date: 2019-10-10 06:36:02
Message-ID: CAA4eK1L+CnmZKN_jeDkbRK7D49pW8C67XFAXd7gTNNFQ_xajkA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 10, 2019 at 9:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Oct 9, 2019 at 7:12 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > I think the current situation is not good, but if we try to cap it to
> > maintenance_work_mem + gin_*_work_mem, I still don't think it will
> > make the situation much better. However, I think the idea you
> > proposed up-thread[1] is better. At least maintenance_work_mem
> > will be the upper limit on what the autovacuum worker can use.
> >
>
> I'm concerned that there are other index AMs that could consume more
> memory, like GIN. In principle we can vacuum third-party index AMs and
> will even be able to parallel vacuum them. I expect
> maintenance_work_mem to be the upper limit on the memory usage of a
> maintenance command, but actually it's hard for the core to limit the
> memory usage of bulkdelete and cleanup. So I thought that since GIN is
> one of the index AMs, it can have a new parameter to make its job
> faster. If we have that parameter it might not make the current
> situation much better, but users will be able to set a lower value for
> that parameter to use less memory while keeping the number of
> index vacuums unchanged.
>

I understand your concern: dividing maintenance_work_mem between
vacuuming the heap and cleaning up the indexes might be tricky,
especially because of third-party indexes. But introducing a new GUC
isn't free either. I think that should be the last resort, and we need
buy-in from more people for it. Did you consider using work_mem for
this? And even if we do go with a new GUC, maybe it is better to give
it a generic name like maintenance_index_work_mem, or something along
those lines, so that it can be used for other index cleanups as well if
required.
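
To make the options under discussion concrete, here is a rough
back-of-the-envelope sketch in Python. It is illustrative arithmetic
only, not PostgreSQL code. It assumes that today the heap scan and an
index cleanup can each use up to maintenance_work_mem on their own, so
the peak can approach twice the setting. maintenance_index_work_mem is
the hypothetical name floated above and does not exist in PostgreSQL;
the function names and the 75% split are likewise invented for the
example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Settings:
    maintenance_work_mem_kb: int
    # Hypothetical GUC proposed in this thread; not a real parameter.
    maintenance_index_work_mem_kb: Optional[int] = None


def peak_uncapped(s: Settings) -> int:
    """Assumed current worst case: heap vacuum and index cleanup
    each take up to maintenance_work_mem independently."""
    return 2 * s.maintenance_work_mem_kb


def peak_with_index_guc(s: Settings) -> int:
    """Worst case if index cleanup is bounded by a separate,
    hypothetical index-specific GUC."""
    if s.maintenance_index_work_mem_kb is None:
        raise ValueError("maintenance_index_work_mem_kb is not set")
    return s.maintenance_work_mem_kb + s.maintenance_index_work_mem_kb


def split_budget(s: Settings, heap_fraction: float) -> Tuple[int, int]:
    """Divide maintenance_work_mem between heap and index cleanup so
    the total never exceeds the setting. The fraction is arbitrary."""
    if not 0.0 <= heap_fraction <= 1.0:
        raise ValueError("heap_fraction must be between 0 and 1")
    heap_kb = int(s.maintenance_work_mem_kb * heap_fraction)
    index_kb = s.maintenance_work_mem_kb - heap_kb
    return heap_kb, index_kb


if __name__ == "__main__":
    s = Settings(maintenance_work_mem_kb=64 * 1024,
                 maintenance_index_work_mem_kb=16 * 1024)
    print("uncapped peak (kB):", peak_uncapped(s))
    print("with separate index GUC (kB):", peak_with_index_guc(s))
    print("split of one budget (heap kB, index kB):", split_budget(s, 0.75))
```

Only the split keeps the total at or below maintenance_work_mem. The
separate-GUC variant lowers the peak only if the new parameter is set
below maintenance_work_mem.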

Tom, Teodor, do you have any opinion on this matter? The behavior in
question was introduced by this commit:

commit ff301d6e690bb5581502ea3d8591a1600fd87acc
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Tue Mar 24 20:17:18 2009 +0000

Implement "fastupdate" support for GIN indexes, in which we try to
accumulate multiple index entries in a holding area before adding them
to the main index structure. This helps because bulk insert is
(usually) significantly faster than retail insert for GIN.
..
..
Teodor Sigaev

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
