Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-10-25 15:48:44
Message-ID: CAD21AoBMo9dr_QmhT=dKh7fmiq7tpx+yLHR8nw9i5NZ-SgtaVg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 25, 2019 at 2:06 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Oct 25, 2019 at 10:22 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > In more detail, my idea is that the first worker that enters
> > vacuum_delay_point adds its local value to the shared value and resets
> > the local value to 0. The worker then sleeps if the shared value exceeds
> > VacuumCostLimit, but before sleeping it can subtract VacuumCostLimit
> > from the shared value. Since vacuum_delay_point is typically called
> > per page processed, I expect there will be no such problem. Thoughts?
>
> Oh right, I assumed that you add the local balance to the shared value
> only when it exceeds VacuumCostLimit, but you are adding it to the
> shared value every time in vacuum_delay_point.
> So I think your idea is correct.

I've attached the updated patch set.

The first three patches add new variables and a callback to the index AM.

The next two patches are the main part that supports parallel vacuum.
I've incorporated all the review comments I've received so far. The
memory layout of the variable-length index statistics might be a bit
complex. It's similar to the format of a heap tuple header in that it
has a null bitmap, which is followed by both the size of the index
statistics and the actual data for each index.
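
To make the layout described above easier to picture, here is a minimal
sketch of a bitmap-prefixed, variable-length statistics area. It is not
the code in the patch: the names (IndexStatsArea, IndexStatsEntry,
stats_area_*), the 8-byte alignment, the "bit set means an entry is
present" convention, and the use of malloc in place of a DSM segment are
all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for MAXALIGN: round up to a multiple of 8. */
#define SKETCH_ALIGN(x) (((size_t) (x) + 7u) & ~(size_t) 7u)

/* Header of the area: index count, then one bitmap bit per index. */
typedef struct IndexStatsArea
{
    uint32_t nindexes;
    uint8_t  bitmap[];      /* bit i set => index i has an entry (assumed) */
} IndexStatsArea;

/* Each present entry stores its payload size, followed by the payload. */
typedef struct IndexStatsEntry
{
    size_t size;
} IndexStatsEntry;

/* Offset of the first entry: header plus bitmap, rounded up. */
static size_t entries_offset(uint32_t nindexes)
{
    return SKETCH_ALIGN(offsetof(IndexStatsArea, bitmap) + (nindexes + 7u) / 8u);
}

/* Bytes occupied by one entry (size header + padded payload). */
static size_t entry_span(size_t payload_size)
{
    return SKETCH_ALIGN(sizeof(IndexStatsEntry)) + SKETCH_ALIGN(payload_size);
}

/* Total bytes needed; sizes[i] == 0 means index i has no entry. */
static size_t stats_area_size(uint32_t nindexes, const size_t *sizes)
{
    size_t total = entries_offset(nindexes);

    for (uint32_t i = 0; i < nindexes; i++)
        if (sizes[i] > 0)
            total += entry_span(sizes[i]);
    return total;
}

/* Allocate and initialize the area; malloc stands in for shared memory. */
static void *stats_area_create(uint32_t nindexes, const size_t *sizes)
{
    char *buf = malloc(stats_area_size(nindexes, sizes));
    char *p;

    if (buf == NULL)
        return NULL;
    ((IndexStatsArea *) buf)->nindexes = nindexes;
    memset(((IndexStatsArea *) buf)->bitmap, 0, (nindexes + 7u) / 8u);
    p = buf + entries_offset(nindexes);
    for (uint32_t i = 0; i < nindexes; i++)
    {
        if (sizes[i] == 0)
            continue;
        ((IndexStatsArea *) buf)->bitmap[i / 8] |= (uint8_t) (1u << (i % 8));
        ((IndexStatsEntry *) p)->size = sizes[i];
        memset(p + SKETCH_ALIGN(sizeof(IndexStatsEntry)), 0, sizes[i]);
        p += entry_span(sizes[i]);
    }
    return buf;
}

/* Return the payload of index idx, or NULL if its bitmap bit is clear. */
static void *stats_area_get(void *buf, uint32_t idx, size_t *size_out)
{
    IndexStatsArea *area = buf;
    char *p = (char *) buf + entries_offset(area->nindexes);

    assert(idx < area->nindexes);
    if (!(area->bitmap[idx / 8] & (1u << (idx % 8))))
        return NULL;
    /* Skip the entries of all earlier indexes that are present. */
    for (uint32_t i = 0; i < idx; i++)
        if (area->bitmap[i / 8] & (1u << (i % 8)))
            p += entry_span(((IndexStatsEntry *) p)->size);
    *size_out = ((IndexStatsEntry *) p)->size;
    return p + SKETCH_ALIGN(sizeof(IndexStatsEntry));
}
```

A caller would compute the per-index sizes first, create the area once,
and then look up each index's slot with stats_area_get; an index whose
bit is clear simply has no slot, which is what keeps the area compact.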

The last patch is a PoC patch that implements the shared vacuum cost
balance. For now it's separate, but after testing both approaches it
will be merged into the 0004 patch. I'll test both next week.
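
As a rough illustration of the shared cost balance idea quoted at the
top of this mail, the logic can be sketched as below. This is not the
code in the PoC patch: C11 atomics stand in for PostgreSQL's own atomics,
the struct and function names are hypothetical, the actual sleep is
replaced by a counter, and treating "reaches the limit" the same as
"exceeds the limit" is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* State shared by all workers (hypothetical layout). */
typedef struct SharedCostState
{
    atomic_int shared_balance;  /* cost accumulated by all workers */
    int        cost_limit;      /* plays the role of VacuumCostLimit */
} SharedCostState;

/* Per-worker state (hypothetical layout). */
typedef struct WorkerCostState
{
    int local_balance;          /* cost accumulated since the last call */
    int sleeps;                 /* how many times this worker would sleep */
} WorkerCostState;

/*
 * Sketch of what each worker does at a delay point: move the local
 * balance into the shared balance, and if the shared balance has reached
 * the limit, subtract the limit and sleep.  Returns true if the caller
 * would sleep.
 */
static bool
sketch_vacuum_delay_point(SharedCostState *shared, WorkerCostState *worker)
{
    int delta;
    int total;

    assert(worker->local_balance >= 0);

    /* Add the local value to the shared value and reset it to 0. */
    delta = worker->local_balance;
    worker->local_balance = 0;
    total = atomic_fetch_add(&shared->shared_balance, delta) + delta;

    if (total < shared->cost_limit)
        return false;

    /* Before sleeping, give back one limit's worth of balance. */
    atomic_fetch_sub(&shared->shared_balance, shared->cost_limit);
    worker->sleeps++;           /* a real worker would sleep here */
    return true;
}
```

With a limit of 200, a worker that contributes 150 does not sleep, and a
second worker that then contributes 80 pushes the shared balance to 230,
sleeps, and leaves 30 behind for the next caller.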

This patch set can be applied on top of the patch[1] that improves
gist index bulk-deletion, so canparallelvacuum is true for gist indexes.

[1] https://www.postgresql.org/message-id/CAFiTN-uQY%2BB%2BCLb8W3YYdb7XmB9hyYFXkAy3C7RY%3D-YSWRV1DA%40mail.gmail.com

Regards,

--
Masahiko Sawada

Attachment Content-Type Size
v31-0002-Add-an-index-AM-callback-to-estimate-DSM-for-par.patch text/x-patch 8.6 KB
v31-0003-Add-an-index-AM-field-to-check-if-use-maintenanc.patch text/x-patch 5.6 KB
v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch text/x-patch 5.9 KB
v31-0004-Add-parallel-option-to-VACUUM-command.patch text/x-patch 60.6 KB
v31-0006-PoC-shared-vacuum-cost-balance.patch text/x-patch 6.1 KB
v31-0001-Add-an-index-AM-field-to-check-parallel-index-pa.patch text/x-patch 5.5 KB
