Re: [HACKERS] Block level parallel vacuum

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Claudio Freire <klaussfreire(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2020-01-17 04:06:41
Message-ID: CAFiTN-veA-GTb7FqRtVC6LRNkNbEXM9S9=b3G_FQeS6UiKZieQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 16, 2020 at 5:34 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jan 16, 2020 at 4:46 PM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > Right. Most indexes (all?) of tables that are used in the regression
> > tests are smaller than min_parallel_index_scan_size. And we set
> > min_parallel_index_scan_size to 0 in vacuum.sql but VACUUM would not
> > be speeded-up much because of the relation size. Since we instead
> > populate new table for parallel vacuum testing the regression test for
> > vacuum would take a longer time.
> >
>
> Fair enough and I think it is good in a way that it won't change the
> coverage of existing vacuum code. I have fixed all the issues
> reported by Mahendra and have fixed a few other cosmetic things in the
> attached patch.
>
I have few small comments.

1.
logical streaming for large in-progress transactions+
+ /* Can't perform vacuum in parallel */
+ if (parallel_workers <= 0)
+ {
+ pfree(can_parallel_vacuum);
+ return lps;
+ }

why are we checking parallel_workers <= 0, Function
compute_parallel_vacuum_workers only returns 0 or greater than 0
so isn't it better to just check if (parallel_workers == 0) ?

2.
+/*
+ * Macro to check if we are in a parallel vacuum. If true, we are in the
+ * parallel mode and the DSM segment is initialized.
+ */
+#define ParallelVacuumIsActive(lps) (((LVParallelState *) (lps)) != NULL)

(LVParallelState *) (lps) -> this typecast is not required, just (lps)
!= NULL should be enough.

3.

+ shared->offset = MAXALIGN(add_size(SizeOfLVShared, BITMAPLEN(nindexes)));
+ prepare_index_statistics(shared, can_parallel_vacuum, nindexes);
+ pg_atomic_init_u32(&(shared->idx), 0);
+ pg_atomic_init_u32(&(shared->cost_balance), 0);
+ pg_atomic_init_u32(&(shared->active_nworkers), 0);

I think it will look cleaner if we can initialize in the order they
are declared in structure.

4.
+ VacuumSharedCostBalance = &(lps->lvshared->cost_balance);
+ VacuumActiveNWorkers = &(lps->lvshared->active_nworkers);
+
+ /*
+ * Set up shared cost balance and the number of active workers for
+ * vacuum delay.
+ */
+ pg_atomic_write_u32(VacuumSharedCostBalance, VacuumCostBalance);
+ pg_atomic_write_u32(VacuumActiveNWorkers, 0);
+
+ /*
+ * The number of workers can vary between bulkdelete and cleanup
+ * phase.
+ */
+ ReinitializeParallelWorkers(lps->pcxt, nworkers);
+
+ LaunchParallelWorkers(lps->pcxt);
+
+ if (lps->pcxt->nworkers_launched > 0)
+ {
+ /*
+ * Reset the local cost values for leader backend as we have
+ * already accumulated the remaining balance of heap.
+ */
+ VacuumCostBalance = 0;
+ VacuumCostBalanceLocal = 0;
+ }
+ else
+ {
+ /*
+ * Disable shared cost balance if we are not able to launch
+ * workers.
+ */
+ VacuumSharedCostBalance = NULL;
+ VacuumActiveNWorkers = NULL;
+ }
+

I don't like the idea of first initializing the
VacuumSharedCostBalance with lps->lvshared->cost_balance and then
uninitialize if nworkers_launched is 0.
I am not sure why do we need to initialize VacuumSharedCostBalance
here? just to perform pg_atomic_write_u32(VacuumSharedCostBalance,
VacuumCostBalance);?
I think we can initialize it only if nworkers_launched > 0 then we can
get rid of the else branch completely.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2020-01-17 04:13:56 Re: [HACKERS] Block level parallel vacuum
Previous Message Asim R P 2020-01-17 04:04:05 Unnecessary delay in streaming replication due to replay lag