Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Claudio Freire <klaussfreire(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2020-01-15 04:34:33
Message-ID: CA+fd4k4ofFKrC5DM0BS27n3adC8g3_DKQixi1-YUSZh1Rw7nMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 14 Jan 2020 at 21:43, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jan 14, 2020 at 10:04 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > On Mon, 13 Jan 2020 at 12:50, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Sat, Jan 11, 2020 at 7:48 PM Masahiko Sawada
> > > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > >
> > > Okay, would it better if we get rid of this variable and have code like below?
> > >
> > > /* Skip the indexes that can be processed by parallel workers */
> > > if ( !(get_indstats(lps->lvshared, i) == NULL ||
> > > skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
> > > continue;
> >
> > Make sense to me.
> >
>
> I have changed the comment and condition to make it a positive test so
> that it is more clear.
>
> > > ...
> > > > Agreed. But with the updated patch the PARALLEL option without the
> > > > parallel degree doesn't display warning because params->nworkers = 0
> > > > in that case. So how about restoring params->nworkers at the end of
> > > > vacuum_rel()?
> > > >
> > >
> > > I had also thought on those lines, but I was not entirely sure about
> > > this resetting of workers. Today, again thinking about it, it seems
> > > the idea Mahendra is suggesting that is giving an error if the
> > > parallel degree is not specified seems reasonable to me. This means
> > > Vacuum (parallel), Vacuum (parallel) <tbl_name>, etc. will give an
> > > error "parallel degree must be specified". This idea has merit as now
> > > we are supporting a parallel vacuum by default, so a 'parallel' option
> > > without a parallel degree doesn't have any meaning. If we do that,
> > > then we don't need to do anything additional about the handling of
> > > temp tables (other than what patch is already doing) as well. What do
> > > you think?
> > >
> >
> > Good point! Agreed.
> >
>
> Thanks, changed accordingly.
>

Thank you for updating the patch! I have a few small comments. The
rest looks good to me.

1.
+ * Compute the number of parallel worker processes to request. Both index
+ * vacuum and index cleanup can be executed with parallel workers. The
+ * relation size of the table don't affect the parallel degree for now.

s/don't/doesn't/

2.
@@ -383,6 +435,7 @@ vacuum(List *relations, VacuumParams *params,
VacuumPageHit = 0;
VacuumPageMiss = 0;
VacuumPageDirty = 0;
+ VacuumSharedCostBalance = NULL;

I think we can initialize VacuumCostBalanceLocal and
VacuumActiveNWorkers here. We use these parameters during parallel
index vacuum and reset at the end but we might want to initialize them
for safety.

3.
+ /* Set cost-based vacuum delay */
+ VacuumCostActive = (VacuumCostDelay > 0);
+ VacuumCostBalance = 0;
+ VacuumPageHit = 0;
+ VacuumPageMiss = 0;
+ VacuumPageDirty = 0;
+ VacuumSharedCostBalance = &(lvshared->cost_balance);
+ VacuumActiveNWorkers = &(lvshared->active_nworkers);

VacuumCostBalanceLocal also needs to be initialized.

4.
The regression tests don't have the test case of PARALLEL 0.

Since I guess you already modifies the code locally I've attached the
diff containing the above review comments.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
review_v47_masahiko.patch application/octet-stream 2.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2020-01-15 04:36:30 Re: backup manifests
Previous Message Kyotaro Horiguchi 2020-01-15 04:02:27 Re: pause recovery if pitr target not reached