Re: [HACKERS] Block level parallel vacuum

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-10-27 07:21:56
Message-ID: CAFiTN-uCXHVV4y-7h9xi36iAndZ9j8afYxQEhi+vbQ0w5BEYmg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 25, 2019 at 9:19 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > In more detail, my idea is that the first worker to enter
> > > vacuum_delay_point adds its local value to the shared value and resets
> > > the local value to 0. The worker then sleeps if the shared value exceeds
> > > VacuumCostLimit, but before sleeping it subtracts VacuumCostLimit
> > > from the shared value. Since vacuum_delay_point is typically called
> > > once per page processed, I expect there will be no such problem. Thoughts?
> >
> > Oh right, I assumed that you add the local balance to the shared
> > value only when it exceeds VacuumCostLimit, but you are adding it to
> > the shared value every time in vacuum_delay_point.
> > So I think your idea is correct.
>
> I've attached the updated patch set.
>
> The first three patches add new variables and a callback to the index AM.
>
> The next two patches are the main part that supports parallel vacuum. I've
> incorporated all review comments I've received so far. The memory layout of
> the variable-length index statistics might be a bit complex. It's similar
> to the format of a heap tuple header, having a null bitmap, followed by
> both the size of the index statistics and the actual data for each index.
>
> The last patch is a PoC patch that implements the shared vacuum cost
> balance. For now it's separate, but after testing both approaches it
> will be merged into the 0004 patch. I'll test both next week.
>
> This patch set can be applied on top of the patch[1] that improves
> gist index bulk-deletion, so canparallelvacuum is true for gist indexes.
>
> [1] https://www.postgresql.org/message-id/CAFiTN-uQY%2BB%2BCLb8W3YYdb7XmB9hyYFXkAy3C7RY%3D-YSWRV1DA%40mail.gmail.com
>
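
For reference, the shared cost balance idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: DemoSharedCost and demo_delay_point are hypothetical names, C11 atomics stand in for whatever shared-memory primitive the patch uses, and the ">=" comparison is a choice made for this sketch. It also ignores the case where two workers cross the limit at the same moment, in which case both would subtract the limit and the shared value could briefly go negative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state; in the real patch this would live in the DSM. */
typedef struct DemoSharedCost
{
    atomic_int  balance;        /* cost accumulated by all workers */
} DemoSharedCost;

/*
 * Hypothetical stand-in for the check done at each delay point.
 * Moves the caller's local balance into the shared balance, resets the
 * local balance to 0, and returns true if the caller should sleep.
 * Before reporting "sleep", one cost_limit is subtracted from the
 * shared balance, as described in the quoted text.
 */
bool
demo_delay_point(DemoSharedCost *shared, int *local_balance, int cost_limit)
{
    int         added;
    int         total;

    assert(shared != NULL && local_balance != NULL);

    added = *local_balance;
    *local_balance = 0;

    /* atomic_fetch_add returns the value before the addition */
    total = atomic_fetch_add(&shared->balance, added) + added;

    if (total < cost_limit)
        return false;           /* limit not reached, keep working */

    atomic_fetch_sub(&shared->balance, cost_limit);
    return true;                /* caller is expected to sleep now */
}
```

A caller would invoke demo_delay_point() once per processed page and sleep for the configured delay whenever it returns true.
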
I haven't read the new patch set yet, but I have noticed one
thing: we get the size of the statistics using the AM
routine, but we copy those statistics from local memory to
shared memory directly using memcpy. Wouldn't it be a good
idea to have an AM-specific routine to copy them from local
memory to shared memory? I am not sure whether it is worth it, but
my thought is that it would allow an AM to keep its local
stats in any form (for example, they could contain a pointer) and
serialize them while copying to the shared stats. Later, when the
shared stats are passed back to the AM, it can deserialize them into
its local form and use them.

+     * Since all vacuum workers write the bulk-deletion result at
+     * different slots we can write them without locking.
+     */
+    if (!shared_indstats->updated && stats[idx] != NULL)
+    {
+        memcpy(bulkdelete_res, stats[idx], shared_indstats->size);
+        shared_indstats->updated = true;
+
+        /*
+         * no longer need the locally allocated result and now
+         * stats[idx] points to the DSM segment.
+         */
+        pfree(stats[idx]);
+        stats[idx] = bulkdelete_res;
+    }
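
To make the suggestion concrete, here is a minimal sketch of what an AM-specific serialize/deserialize pair could look like. Everything in it is hypothetical: DemoLocalStats and the demo_stats_* functions are made-up names, not part of the patch or of the index AM API, and plain malloc stands in for palloc. The point is only that a local form holding a pointer cannot be copied with a single memcpy, while a flat form can be rebuilt on the other side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical AM-local stats: the pointer is only valid in this process. */
typedef struct DemoLocalStats
{
    double      tuples_removed;
    size_t      nfree;          /* number of entries in free_pages */
    unsigned   *free_pages;     /* separately allocated array */
} DemoLocalStats;

/* Hypothetical callback: bytes needed for the flat (shared) form. */
size_t
demo_stats_estimate(const DemoLocalStats *local)
{
    return sizeof(double) + sizeof(size_t) + local->nfree * sizeof(unsigned);
}

/* Hypothetical callback: flatten the local form into dest (shared memory). */
void
demo_stats_serialize(const DemoLocalStats *local, char *dest)
{
    assert(local != NULL && dest != NULL);

    memcpy(dest, &local->tuples_removed, sizeof(double));
    dest += sizeof(double);
    memcpy(dest, &local->nfree, sizeof(size_t));
    dest += sizeof(size_t);
    if (local->nfree > 0)
        memcpy(dest, local->free_pages, local->nfree * sizeof(unsigned));
}

/* Hypothetical callback: rebuild a local form from the flat form. */
DemoLocalStats *
demo_stats_deserialize(const char *src)
{
    DemoLocalStats *local = malloc(sizeof(DemoLocalStats));

    if (local == NULL)
        return NULL;

    memcpy(&local->tuples_removed, src, sizeof(double));
    src += sizeof(double);
    memcpy(&local->nfree, src, sizeof(size_t));
    src += sizeof(size_t);

    local->free_pages = NULL;
    if (local->nfree > 0)
    {
        local->free_pages = malloc(local->nfree * sizeof(unsigned));
        if (local->free_pages == NULL)
        {
            free(local);
            return NULL;
        }
        memcpy(local->free_pages, src, local->nfree * sizeof(unsigned));
    }
    return local;
}
```

With something like this, the memcpy in the hunk above would become a call to the AM's serialize routine, and the size reported by the AM would come from the matching estimate routine.
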

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
