Re: [HACKERS] Block level parallel vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-12-06 05:20:33
Message-ID: CAA4eK1J0hiKJctkjouu8qTCNfBzbW303BiCkU7f_qUzzY6+rHw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> I think it might be a good idea to change what we expect index AMs to
> do rather than trying to make anything that they happen to be doing
> right now work, no matter how crazy. In particular, suppose we say
> that you CAN'T add data on to the end of IndexBulkDeleteResult any
> more, and that instead the extra data is passed through a separate
> parameter. And then you add an estimate method that gives the size of
> the space provided by that parameter (and if the estimate method isn't
> defined then the extra parameter is passed as NULL) and document that
> the data stored there might get flat-copied.
>

I think this is a good idea and serves the purpose we are trying to
achieve currently. However, any index AM that currently passes stats
with additional information in this way would need to change even if
it doesn't want to use the parallel vacuum functionality (say, because
its indexes are too small or for whatever other reason). I think this
is a reasonable trade-off and the changes on their end won't be that
big. So, we should do this.

> Now, you've taken the
> onus off of parallel vacuum to cope with any crazy thing a
> hypothetical AM might be doing, and instead you've defined the
> behavior of that hypothetical AM as wrong. If somebody really needs
> that, it's now their job to modify the index AM machinery further
> instead of your job to somehow cope.
>

Makes sense.

> > Here, we have a need to reduce the number of workers. Index vacuum
> > has two different phases (index vacuum and index cleanup) which use
> > the same parallel-context/DSM but could have different worker
> > requirements. The second phase (cleanup) would normally need fewer
> > workers: if the work is done in the first phase, the second doesn't
> > need it. But there are exceptions like gin indexes, where we need
> > workers for the second phase as well, because gin makes a pass over
> > the index again even if we cleaned it in the first phase. Now,
> > consider the case where we have 3 btree indexes and 2 gin indexes:
> > we would need 5 workers for the index vacuum phase and 2 workers
> > for the index cleanup phase. There are other cases too.
> >
> > We also considered having a separate DSM for each phase, but that
> > appeared to add overhead without much benefit.
>
> How about adding an additional argument to ReinitializeParallelDSM()
> that allows the number of workers to be reduced? That seems like it
> would be less confusing than what you have now, and would involve
> modifying code in a lot fewer places.
>

Yeah, we can do that. We can maintain some information in
LVParallelState which indicates whether we need to reinitialize the
DSM before launching workers. Sawada-San, do you see any problem with
this idea?

> > > Is there any legitimate use case for parallel vacuum in combination
> > > with vacuum cost delay?
> > >
> >
> > Yeah, we also initially thought that it is not legitimate to use a
> > parallel vacuum with a cost delay. But to get a wider view, we
> > started a separate thread [2] and there we reach to the conclusion
> > that we need a solution for throttling [3].
>
> OK, thanks for the pointer. This doesn't address the other part of my
> complaint, though, which is that the whole discussion between you and
> Dilip and Sawada-san presumes that the delays ought to be scattered
> across the workers roughly in proportion to their share of the I/O,
> and it seems NOT AT ALL clear that this is actually a desirable
> property. You're all assuming that, but none of you has justified it,
> and I think the opposite might be true in some cases.
>

IIUC, your complaint is that in some cases, even if the I/O rate is
enough for one worker, we will still launch more workers and throttle
them. The point is that we can't know in advance how much I/O each
index requires. We could try to estimate it based on index size, but I
don't think that would be right: for a bigger index, we might not need
to dirty the pages at all, most of the pages might already be in
shared buffers, and so on. The current algorithm won't use more I/O
than required; it is good for cases where one or some of the indexes
do more I/O than the others, and it works equally well when the
indexes have a similar amount of work. I think we could do better
only if we could predict how much I/O each index requires before
actually scanning it.

I agree with the other points (add a FAST option for parallel vacuum
and document that parallel vacuum is still potentially throttled ...)
you made in a separate email.

> You're adding extra complexity for something that isn't a clear
> improvement.
>
> > Your understanding is correct. How about if we modify it to
> > something like: "Note that parallel workers are alive only during
> > index vacuum or index cleanup, but the leader process neither exits
> > parallel mode nor destroys the parallel context until the entire
> > parallel operation is finished." OR something like "The leader
> > backend holds the parallel context until index vacuum and cleanup
> > are finished. Index vacuum and index cleanup each perform their
> > work with parallel workers."
>
> How about if you just delete it? You don't need a comment explaining
> that this caller of CreateParallelContext() does something which
> *every* caller of CreateParallelContext() must do. If you didn't do
> that, you'd fail assertions and everything would break, so *of course*
> you are doing it.
>

Fair enough, we can just remove this part of the comment.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
