Re: [HACKERS] Block level parallel vacuum

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: robertmhaas(at)gmail(dot)com
Cc: sawada(dot)mshk(at)gmail(dot)com, kommi(dot)haribabu(at)gmail(dot)com, amit(dot)kapila16(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp, thomas(dot)munro(at)enterprisedb(dot)com, david(at)pgmasters(dot)net, klaussfreire(at)gmail(dot)com, simon(at)2ndquadrant(dot)com, pavan(dot)deolasee(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-03-26 07:46:36
Message-ID: 20190326.164636.146043883.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello.

At Thu, 21 Mar 2019 15:51:40 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmobkRtLb5frmEF5t9U=d+iV9c5emtN+NrRS_xrHaH1Z20A(at)mail(dot)gmail(dot)com>
> On Tue, Mar 19, 2019 at 3:59 AM Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > The leader doesn't continue heap-scan while index vacuuming is
> > running. And the index-page-scan seems eat up CPU easily. If
> > index vacuum can run simultaneously with the next heap scan
> > phase, we can make index scan finishes almost the same time with
> > the next round of heap scan. It would reduce the (possible) CPU
> > contention. But this requires as the twice size of shared
> > memory as the current implement.
>
> I think you're approaching this from the wrong point of view. If we
> have a certain amount of memory available, is it better to (a) fill
> the entire thing with dead tuples once, or (b) better to fill half of
> it with dead tuples, start index vacuuming, and then fill the other
> half of it with dead tuples for the next index-vacuum cycle while the
> current one is running? I think the answer is that (a) is clearly

Sure.

> better, because it results in half as many index vacuum cycles.

The "problem" I see there is that it stops heap scanning on the
leader process. The leader cannot start the next heap scan until
the index scan on the workers ends.

With the half-and-half strategy the heap scan is expected not to
stop, especially when all the index pages are in memory. But that
is not always the case, of course.
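
To make the trade-off between (a) and (b) concrete, here is a toy
cost model. It is only a sketch, not code from the patch: the
function names, the cost units, and the assumption that one index
vacuum cycle costs the same regardless of how many dead tuples it
removes are all made up for illustration.

```python
def index_vacuum_cycles(dead_tuples: int, slots_per_cycle: int) -> int:
    """Cycles needed when one cycle removes at most slots_per_cycle tuples."""
    return (dead_tuples + slots_per_cycle - 1) // slots_per_cycle


def elapsed_fill_whole(dead_tuples, capacity, heap_cost_per_tuple, index_cycle_cost):
    """Strategy (a): fill the whole array, then stop the heap scan and vacuum indexes."""
    cycles = index_vacuum_cycles(dead_tuples, capacity)
    # Heap scanning and index vacuuming never overlap, so the costs simply add up.
    return dead_tuples * heap_cost_per_tuple + cycles * index_cycle_cost


def elapsed_half_and_half(dead_tuples, capacity, heap_cost_per_tuple, index_cycle_cost):
    """Strategy (b): fill one half while the other half is being index-vacuumed.

    Assumes dead_tuples is a multiple of capacity // 2 (hypothetical simplification).
    """
    half = capacity // 2
    cycles = index_vacuum_cycles(dead_tuples, half)
    fill = half * heap_cost_per_tuple
    # First fill runs alone, the last index vacuum runs alone, and every step
    # in between takes as long as the slower of the two overlapped activities.
    return fill + (cycles - 1) * max(fill, index_cycle_cost) + index_cycle_cost


if __name__ == "__main__":
    for index_cost in (1500, 3000):
        a = elapsed_fill_whole(8000, 2000, 1, index_cost)
        b = elapsed_half_and_half(8000, 2000, 1, index_cost)
        print(f"index cycle cost {index_cost}: (a) = {a}, (b) = {b}")
```

In this model (b) always runs twice as many index vacuum cycles as
(a). It finishes earlier only when a single index cycle is cheap
compared with filling the array, which is roughly the "index pages
are in memory" case, and it falls behind once the index cycle
dominates.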

> We can't really ask the user how much memory it's OK to use and then
> use twice as much. But if we could, what you're proposing here is
> probably still not the right way to use it.

Yes. I thought I had written that with that implication in mind.
"requires as the twice size" has the negative implication you
describe above.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
