Re: [HACKERS] Block level parallel vacuum

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-10-29 04:47:51
Message-ID: CAFiTN-vROAtHfE2A0hXifgz81muTd3wREcM+JHhwpy9AsthiZA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 29, 2019 at 10:01 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Oct 28, 2019 at 6:08 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > >
> > > > > To explain my idea in more detail: the first worker that enters
> > > > > vacuum_delay_point adds its local value to the shared value and
> > > > > resets the local value to 0. The worker then sleeps if the shared
> > > > > value exceeds VacuumCostLimit, but before sleeping it subtracts
> > > > > VacuumCostLimit from the shared value. Since vacuum_delay_point is
> > > > > typically called once per page processed, I don't expect such a
> > > > > problem. Thoughts?
> > > >
> > > > Oh right, I assumed you were adding the local balance to the shared
> > > > value only when it exceeded VacuumCostLimit, but you are actually
> > > > adding it to the shared value every time in vacuum_delay_point. So I
> > > > think your idea is correct.
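
Just to confirm my understanding of that accounting, below is a rough
sketch of what I have in mind. The shared struct, its spinlock, and the
function name are only placeholders of mine, not taken from your patch:

#include "postgres.h"

#include "miscadmin.h"
#include "storage/spin.h"

/* Hypothetical DSM-shared state; not the layout used in the patch. */
typedef struct VacuumCostShared
{
	slock_t		lock;
	int			cost_balance;	/* balance accumulated by all workers */
} VacuumCostShared;

static VacuumCostShared *vacuum_cost_shared;	/* points into DSM */

/*
 * Sketch of the accounting described above: every call moves the local
 * balance into the shared counter, and the worker that observes the
 * shared counter crossing VacuumCostLimit subtracts the limit and pays
 * the sleep.
 */
static void
shared_vacuum_delay_point(void)
{
	bool		need_sleep = false;

	SpinLockAcquire(&vacuum_cost_shared->lock);
	vacuum_cost_shared->cost_balance += VacuumCostBalance;
	VacuumCostBalance = 0;

	if (vacuum_cost_shared->cost_balance >= VacuumCostLimit)
	{
		vacuum_cost_shared->cost_balance -= VacuumCostLimit;
		need_sleep = true;
	}
	SpinLockRelease(&vacuum_cost_shared->lock);

	if (need_sleep)
		pg_usleep((long) (VacuumCostDelay * 1000));
}

If that matches what you have in mind, the only contention point is the
spinlock held for a few additions per page, which seems cheap enough.
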
> > >
> > > I've attached the updated patch set.
> > >
> > > The first three patches add new variables and a callback to the index AM.
> > >
> > > The next two patches are the main part that supports parallel vacuum.
> > > I've incorporated all review comments I got so far. The memory layout
> > > of the variable-length index statistics might be a bit complex. It's
> > > similar to the format of the heap tuple header in that it has a null
> > > bitmap, followed by both the size of the index statistics and the
> > > actual data for each index.
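
Just to check my understanding of that layout, something like the
following is what I picture. The struct and field names here are only my
guesses, not what the patch actually uses:

#include "postgres.h"

/*
 * Guessed shape of the shared index-statistics area; names are mine.
 * Like a heap tuple header, a null bitmap first records which indexes
 * have statistics, and then for each such index the size of its
 * statistics and the statistics data itself follow.
 */
typedef struct SharedIndexStats
{
	int			nindexes;		/* number of indexes of the table */
	bits8		null_bitmap[FLEXIBLE_ARRAY_MEMBER]; /* 1 bit per index */

	/*
	 * After the MAXALIGN'd bitmap, for every index whose bit is set:
	 *
	 *		Size					size of the statistics data
	 *		IndexBulkDeleteResult	the statistics data itself
	 */
} SharedIndexStats;
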
> > >
> > > The last patch is a PoC patch that implements the shared vacuum cost
> > > balance. For now it's separate, but after testing both approaches it
> > > will be merged into the 0004 patch. I'll test both next week.
> > >
> > > This patch set can be applied on top of the patch[1] that improves
> > > gist index bulk-deletion, so canparallelvacuum of the gist index is
> > > true.
> > >
> >
> > + /* Get the space for IndexBulkDeleteResult */
> > + bulkdelete_res = GetIndexBulkDeleteResult(shared_indstats);
> > +
> > + /*
> > + * Update the pointer to the corresponding bulk-deletion result
> > + * if someone has already updated it.
> > + */
> > + if (shared_indstats->updated && stats[idx] == NULL)
> > + stats[idx] = bulkdelete_res;
> > +
> >
> > I have a doubt about this hunk: I don't understand when this condition
> > will be hit. Whenever we set shared_indstats->updated to true, we also
> > set stats[idx] to the shared stats at the same time. So I am not sure
> > in which case shared_indstats->updated can be true while stats[idx] is
> > still NULL.
> >
>
> I think it can be true in the case where a parallel vacuum worker
> vacuums an index that was vacuumed by another worker in the previous
> index vacuum cycle. Suppose that worker-A and worker-B vacuumed index-A
> and index-B respectively. After that, worker-A vacuums index-B in the
> next index vacuum cycle. In this case, shared_indstats->updated is true
> because worker-B already vacuumed it in the previous cycle. On the
> other hand, stats[idx] on worker-A is NULL because it's the first time
> worker-A has vacuumed index-B. Therefore worker-A updates its
> stats[idx] to point to the bulk-deletion result in DSM in order to pass
> it to the index AM.

Okay, that makes sense.
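
Restating it for myself as a minimal sketch (shared_indstats, stats and
GetIndexBulkDeleteResult are the names from the quoted hunk; the cycle
numbering is just for illustration):

/*
 * Scenario from above, for index-B as seen by worker-A:
 *
 * Cycle 1: worker-B vacuums index-B, copies the result into DSM and
 *          sets shared_indstats->updated = true.  Worker-A never
 *          touches index-B, so its local stats[idx] stays NULL.
 *
 * Cycle 2: worker-A picks up index-B.  updated is true but stats[idx]
 *          is NULL, so the hunk makes stats[idx] point at the DSM copy
 *          before calling the index AM.
 */
if (shared_indstats->updated && stats[idx] == NULL)
	stats[idx] = GetIndexBulkDeleteResult(shared_indstats);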

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
