Re: [HACKERS] Block level parallel vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2018-11-27 02:25:50
Message-ID: CAA4eK1KUmePSZ+Yc_UoWMa7m3FNLNiaQCwJYap3vFTHJmWXmeQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 26, 2018 at 2:08 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sun, Nov 25, 2018 at 2:35 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > >
>
> Thank you for the comment.
>
> > > I could see that you have put a lot of effort on this patch and still
> > > we are not able to make much progress mainly I guess because of
> > > relation extension lock problem. I think we can park that problem for
> > > some time (as already we have invested quite some time on it), discuss
> > > a bit about actual parallel vacuum patch and then come back to it.
> > >
> >
> > Today, I was reading this and previous related thread [1] and it seems
> > to me multiple people Andres [2], Simon [3] have pointed out that
> > parallelization for index portion is more valuable. Also, some of the
> > results [4] indicate the same. Now, when there are no indexes,
> > parallelizing heap scans also have benefit, but I think in practice we
> > will see more cases where the user wants to vacuum tables with
> > indexes. So how about if we break this problem in the following way
> > where each piece give the benefit of its own:
> > (a) Parallelize index scans wherein the workers will be launched only
> > to vacuum indexes. Only one worker per index will be spawned.
> > (b) Parallelize per-index vacuum. Each index can be vacuumed by
> > multiple workers.
> > (c) Parallelize heap scans where multiple workers will scan the heap,
> > collect dead TIDs and then launch multiple workers for indexes.
> >
> > I think if we break this problem into multiple patches, it will reduce
> > the scope of each patch and help us in making progress. Now, it's
> > been more than 2 years that we are trying to solve this problem, but
> > still didn't make much progress. I understand there are various
> > genuine reasons and all of that work will help us in solving all the
> > problems in this area. How about if we first target problem (a) and
> > once we are done with that we can see which of (b) or (c) we want to
> > do first?
>
> Thank you for suggestion. It seems good to me. We would get a nice
> performance scalability even by only (a), and vacuum will get more
> powerful by (b) or (c). Also, (a) would not require to resovle the
> relation extension lock issue IIUC.
>

Yes, I also think so. We do acquire 'relation extension lock' during
index vacuum, but as part of (a), we are talking one worker per-index,
so there shouldn't be a problem with respect to deadlocks.

> I'll change the patch and submit
> to the next CF.
>

Okay.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-11-27 02:40:25 Re: A WalSnd issue related to state WALSNDSTATE_STOPPING
Previous Message Paul Guo 2018-11-27 02:07:04 Re: A WalSnd issue related to state WALSNDSTATE_STOPPING