Re: [HACKERS] Block level parallel vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-10-04 10:48:13
Message-ID: CAA4eK1JqKQMMP0+Zub29nk2S02eZRK-4tQ2=4fwkB7aK8MgYJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >
> > *
> > +end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
> > {
> > ..
> > + /* Shutdown worker processes and destroy the parallel context */
> > + WaitForParallelWorkersToFinish(lps->pcxt);
> > ..
> > }
> >
> > Do we really need to call WaitForParallelWorkersToFinish here as it
> > must have been called in lazy_parallel_vacuum_or_cleanup_indexes
> > before this time?
>
> No, removed.
>

+ /* Shutdown worker processes and destroy the parallel context */
+ DestroyParallelContext(lps->pcxt);

But you forget to update the comment.

Few more comments:
--------------------------------
1.
+/*
+ * Parallel Index vacuuming and index cleanup routine used by both the
leader
+ * process and worker processes. Unlike single process vacuum, we don't
update
+ * index statistics after cleanup index since it is not allowed during
+ * parallel mode, instead copy index bulk-deletion results from the local
+ * memory to the DSM segment and update them at the end of parallel lazy
+ * vacuum.
+ */
+static void
+do_parallel_vacuum_or_cleanup_indexes(Relation *Irel, int nindexes,
+ IndexBulkDeleteResult **stats,
+ LVShared *lvshared,
+ LVDeadTuples *dead_tuples)
+{
+ /* Loop until all indexes are vacuumed */
+ for (;;)
+ {
+ int idx;
+
+ /* Get an index number to process */
+ idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+ /* Done for all indexes? */
+ if (idx >= nindexes)
+ break;
+
+ /*
+ * Update the pointer to the corresponding bulk-deletion result
+ * if someone has already updated it.
+ */
+ if (lvshared->indstats[idx].updated &&
+ stats[idx] == NULL)
+ stats[idx] = &(lvshared->indstats[idx].stats);
+
+ /* Do vacuum or cleanup one index */
+ if (!lvshared->for_cleanup)
+ lazy_vacuum_index(Irel[idx], &stats[idx], dead_tuples,
+ lvshared->reltuples);
+ else
+ lazy_cleanup_index(Irel[idx], &stats[idx], lvshared->reltuples,
+ lvshared->estimated_count);

It seems we always run index cleanup via parallel worker which seems
overkill because the cleanup index generally scans the index only when
bulkdelete was not performed. In some cases like for hash index, it
doesn't do anything even bulk delete is not called. OTOH, for brin index,
it does the main job during cleanup but we might be able to always allow
index cleanup by parallel worker for brin indexes if we remove the
allocation in brinbulkdelete which I am not sure is of any use.

I think we shouldn't call cleanup via parallel worker unless bulkdelete
hasn't been performed on the index.

2.
- for (i = 0; i < nindexes; i++)
- lazy_vacuum_index(Irel[i],
- &indstats[i],
- vacrelstats);
+ lazy_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+ indstats, lps, false);

Indentation is not proper. You might want to run pgindent.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Asif Rehman 2019-10-04 11:01:55 Re: WIP/PoC for parallel backup
Previous Message vignesh C 2019-10-04 10:33:23 Re: dropdb --force