Re: [HACKERS] Block level parallel vacuum

From: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2020-01-15 03:34:29
Message-ID: CAKYtNAoKiOxDV959deWonL1=8TtpwRsZVuCXwuQbeCwW-rETxQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, 14 Jan 2020 at 17:16, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
>
> On Tue, 14 Jan 2020 at 16:17, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >
> > On Tue, 14 Jan 2020 at 10:06, Masahiko Sawada
> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > >
> > > On Tue, 14 Jan 2020 at 03:20, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> > > >
> > > > On Fri, 10 Jan 2020 at 15:51, Sergei Kornilov <sk(at)zsrv(dot)org> wrote:
> > > > >
> > > > > Hi
> > > > > > Thank you for the update! I looked again.
> > > > >
> > > > > (vacuum_indexes_leader)
> > > > > + /* Skip the indexes that can be processed by parallel workers */
> > > > > + if (!skip_index)
> > > > > + continue;
> > > > >
> > > > > > Isn't the variable name skip_index confusing here? Maybe rename it to something like can_parallel?
> > > > >
> > > >
> > > > I looked at the code again. If we add a boolean flag (can_parallel)
> > > > to the IndexBulkDeleteResult structure to record whether an index
> > > > supports parallel vacuum, it becomes easy to skip those indexes, and
> > > > we no longer need to call skip_parallel_vacuum_index multiple times
> > > > (from vacuum_indexes_leader and parallel_vacuum_index).
> > > > We could also keep a linked list of the indexes that do not support
> > > > parallel vacuum and pass it directly to vacuum_indexes_leader.
> > > >
> > > > Ex: suppose a table has 5 indexes. Before launching the parallel
> > > > workers, we set the boolean flag (can_parallel) in the
> > > > IndexBulkDeleteResult structure to record whether each index
> > > > supports parallel vacuum.
> > > > Say indexes 1 and 4 do not support parallel vacuum. The linked list
> > > > then already records 1->4 as not supporting parallel vacuum, so
> > > > parallel_vacuum_index will process these indexes and the rest will be
> > > > processed by parallel workers. If a parallel worker finds that
> > > > can_parallel is false, it will skip that index.
> > > >
> > > > As per my understanding, if we implement this, we avoid calling
> > > > skip_parallel_vacuum_index multiple times, and if every index
> > > > supports parallel vacuum, the head of the list points to NULL and
> > > > we do not call vacuum_indexes_leader at all. (We save an
> > > > unnecessary call to vacuum_indexes_leader.)
> > > >
> > > > Thoughts?
> > > >
> > >
> > > We skip not only indexes that don't support parallel index vacuum but
> > > also indexes that do support it, depending on the vacuum phase. That
> > > is, we could skip different indexes in different vacuum phases.
> > > Therefore, with your idea, we would need at least three linked lists,
> > > one for each possible vacuum phase (bulkdelete, conditional cleanup
> > > and cleanup), is that right?
> > >
> > > I think we can check if there are indexes that should be processed by
> > > the leader process before entering the loop in vacuum_indexes_leader
> > > by comparing nindexes_parallel_XXX of LVParallelState to the number of
> > > indexes but I'm not sure it's effective since the number of indexes on
> > > a table should be small.
> > >
> >
> > Hi,
> >
> > + /*
> > + * Try to initialize the parallel vacuum if requested
> > + */
> > + if (params->nworkers >= 0 && vacrelstats->useindex)
> > + {
> > + /*
> > + * Since parallel workers cannot access data in temporary tables, we
> > + * can't perform parallel vacuum on them.
> > + */
> > + if (RelationUsesLocalBuffers(onerel))
> > + {
> > + /*
> > + * Give warning only if the user explicitly tries to perform a
> > + * parallel vacuum on the temporary table.
> > + */
> > + if (params->nworkers > 0)
> > + ereport(WARNING,
> > + (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
> >
> > From the v45 patch, we moved the temporary-table warning inside the
> > "params->nworkers >= 0 && vacrelstats->useindex" check, so if the table
> > doesn't have any index, we don't give any warning. I think we
> > should give the warning for all temporary tables if a parallel degree
> > is given. (Until the v44 patch, we gave the warning for all temporary
> > tables, with or without indexes.)
> >
> > Thoughts?
>
> Hi,
> I did some more review. Below is one review comment for v46-0002.
>
> + /*
> + * Initialize the state for parallel vacuum
> + */
> + if (params->nworkers >= 0 && vacrelstats->useindex)
> + {
> + /*
> + * Since parallel workers cannot access data in temporary tables, we
> + * can't perform parallel vacuum on them.
> + */
> + if (RelationUsesLocalBuffers(onerel))
>
> In the above check, we should add a "nindexes > 1" condition so that if there is only 1 index, we will not call begin_parallel_vacuum.

I think an "if (params->nworkers >= 0 && nindexes > 1)" check will be
enough here.
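
To make the condition concrete, here is a minimal standalone sketch, not the patch code. The names SketchParams, SketchDecision and sketch_parallel_decision are hypothetical stand-ins, and treating a negative nworkers as "parallel vacuum not requested" is my assumption; only the two conditions on nworkers and the "nindexes > 1" check come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one VacuumParams field used here. */
typedef struct SketchParams
{
    int nworkers;   /* assumed: < 0 not requested, 0 not explicit, > 0 explicit */
} SketchParams;

/* Hypothetical outcome codes, for illustration only. */
typedef enum SketchDecision
{
    SKETCH_SERIAL,            /* do not set up parallel vacuum */
    SKETCH_SERIAL_WARN_TEMP,  /* temporary table: warn, then stay serial */
    SKETCH_PARALLEL           /* would call begin_parallel_vacuum */
} SketchDecision;

/*
 * Decide whether parallel vacuum would be initialized under the proposed
 * "params->nworkers >= 0 && nindexes > 1" check.
 */
static SketchDecision
sketch_parallel_decision(const SketchParams *params, int nindexes, bool is_temp)
{
    assert(params != NULL);

    /* Proposed guard: skip the whole block with zero or one index. */
    if (!(params->nworkers >= 0 && nindexes > 1))
        return SKETCH_SERIAL;

    /* Parallel workers cannot access temporary tables. */
    if (is_temp)
    {
        /* Warn only when the user explicitly asked for a parallel degree. */
        if (params->nworkers > 0)
            return SKETCH_SERIAL_WARN_TEMP;
        return SKETCH_SERIAL;
    }

    return SKETCH_PARALLEL;
}
```

For example, with nworkers = 2 the sketch returns SKETCH_PARALLEL for a regular table with 3 indexes and SKETCH_SERIAL with only 1 index. Note that, as sketched, a temporary table with at most one index never reaches the warning, which is the same kind of case raised in my earlier comment about the v45 patch.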

Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
