Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2018-10-31 00:23:18
Message-ID: CAD21AoAEDGrUKf2M3MinrF5juWkXgwmEjmZFHE_L=nv-fovKLA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 30, 2018 at 5:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Aug 14, 2018 at 9:31 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Nov 30, 2017 at 11:09 AM, Michael Paquier
> > <michael(dot)paquier(at)gmail(dot)com> wrote:
> > > On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >> Yeah, I was thinking the commit is relevant to this issue, but as
> > >> Amit mentioned, this error is emitted by DROP SCHEMA CASCADE.
> > >> I haven't found the cause of this issue yet. With the previous
> > >> version of the patch, autovacuum workers were working with one parallel
> > >> worker but it never drops relations. So it's possible that the error
> > >> is not related to the patch, but anyway I'll continue to work
> > >> on that.
> > >
> > > This depends on the extension lock patch from
> > > https://www.postgresql.org/message-id/flat/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA(at)mail(dot)gmail(dot)com/
> > > if I am following correctly. So I propose to mark this patch as
> > > returned with feedback for now, and come back to it once the root
> > > problems are addressed. Feel free to correct me if you think that's
> > > not appropriate.
> >
> > I've re-designed the parallel vacuum patch. Attached is the latest
> > version of the patch. As discussed so far, this patch depends on the
> > extension lock patch[1]. However, I think we can discuss the design
> > of parallel vacuum independently from that patch. That's why I'm
> > proposing the new patch. In this patch, I restructured and refined
> > lazy_scan_heap() because it's a single big function and not suitable
> > for parallelization.
> >
> > The parallel vacuum worker processes keep waiting for commands from
> > the parallel vacuum leader process. Before entering each phase of lazy
> > vacuum, such as scanning the heap, vacuuming indexes and vacuuming the
> > heap, the leader process changes the state of all workers to the next
> > state. Vacuum worker processes do the job according to their state and
> > wait for the next command after finishing. Also, before entering the
> > next phase, the leader process does some preparation work while the
> > vacuum workers are sleeping; for example, clearing the shared dead
> > tuple space before entering the 'scanning heap' phase. The status of
> > the vacuum workers is stored in a DSM area pointed to by WorkerState
> > variables, and controlled by the leader process. For the basic design
> > and performance improvements, please refer to my presentation at
> > PGCon 2018[2].
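
A minimal sketch of the leader-driven phase switching described above, in plain C with no real shared memory or processes. All names here (WorkerPhase, SharedVacuumState, leader_begin_phase, worker_run_phase) are hypothetical and not taken from the patch; the struct only stands in for the DSM area, and the "work" done in each phase is a placeholder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phase names; the patch's actual WorkerState values may differ. */
typedef enum {
    WORKER_IDLE,
    WORKER_SCAN_HEAP,
    WORKER_VACUUM_INDEX,
    WORKER_VACUUM_HEAP,
    WORKER_DONE
} WorkerPhase;

#define MAX_WORKERS 8

/* Stand-in for the DSM area shared by the leader and the workers. */
typedef struct {
    int         nworkers;
    WorkerPhase phase[MAX_WORKERS];    /* written only by the leader */
    int         finished[MAX_WORKERS]; /* set by a worker when its phase is done */
    size_t      ndead_tuples;          /* shared dead tuple space (count only) */
} SharedVacuumState;

/* Leader: do per-phase preparation, then move every worker to the phase. */
static void leader_begin_phase(SharedVacuumState *s, WorkerPhase next)
{
    assert(s->nworkers >= 0 && s->nworkers <= MAX_WORKERS);
    if (next == WORKER_SCAN_HEAP)
        s->ndead_tuples = 0;           /* clear the shared dead tuple space */
    for (int i = 0; i < s->nworkers; i++) {
        s->finished[i] = 0;
        s->phase[i] = next;
    }
}

/* Worker: act on the phase assigned by the leader, then report completion. */
static void worker_run_phase(SharedVacuumState *s, int id)
{
    assert(id >= 0 && id < s->nworkers);
    switch (s->phase[id]) {
    case WORKER_SCAN_HEAP:
        s->ndead_tuples += 10;         /* placeholder: pretend 10 dead tuples found */
        break;
    case WORKER_VACUUM_INDEX:
    case WORKER_VACUUM_HEAP:
    case WORKER_IDLE:
    case WORKER_DONE:
        break;                         /* placeholder: no simulated work */
    }
    s->finished[id] = 1;
}

/* Leader: true once every worker has finished the current phase. */
static int all_workers_finished(const SharedVacuumState *s)
{
    for (int i = 0; i < s->nworkers; i++)
        if (!s->finished[i])
            return 0;
    return 1;
}
```

In a real implementation the workers would sleep on a latch or condition variable instead of being called directly, and updates to the shared counters would need locking or atomics; the sketch leaves both out to keep the control flow visible.
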
> >
> > The number of parallel vacuum workers is determined according to
> > either the table size or the PARALLEL option in the VACUUM command.
> > The maximum number of parallel workers is
> > max_parallel_maintenance_workers.
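
A rough sketch of how such a worker count could be chosen. The cap corresponds to max_parallel_maintenance_workers as stated above; the size-based rule (one more worker each time the table is three times larger than a threshold) is only an assumption borrowed from how parallel scan workers are commonly sized, not something confirmed by the patch. The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/*
 * requested       : value of the PARALLEL option, or 0 if not given
 * rel_pages       : table size in pages
 * threshold_pages : assumed minimum size before any worker is used
 * max_workers     : cap, e.g. max_parallel_maintenance_workers
 */
static int choose_vacuum_workers(int requested, long rel_pages,
                                 long threshold_pages, int max_workers)
{
    int n = 0;

    assert(rel_pages >= 0 && threshold_pages > 0 && max_workers >= 0);

    if (requested > 0) {
        n = requested;                 /* explicit PARALLEL option wins */
    } else {
        /* Assumed heuristic: one more worker per 3x growth in table size. */
        long t = threshold_pages;
        while (rel_pages >= t) {
            n++;
            if (t > LONG_MAX / 3)
                break;                 /* avoid overflow on huge thresholds */
            t *= 3;
        }
    }
    return n < max_workers ? n : max_workers;
}
```

For example, with a threshold of 1024 pages, a 3072-page table would get two workers under this assumed rule, and an explicit request of four workers would still be clamped to the cap.
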
> >
> > I've moved the code for the vacuum worker process into
> > backends/commands/vacuumworker.c, and created the
> > includes/commands/vacuum_internal.h file to hold the declarations
> > for lazy vacuum.
> >
> > For autovacuum, this patch allows an autovacuum worker process to use
> > the parallel option according to the relation size or the reloption.
> > However, since there are no slots for autovacuum's parallel workers in
> > AutoVacuumShmem, this patch doesn't support changing the autovacuum
> > delay configuration while the vacuum is running.
> >
>
> Attached is a version of the patch rebased to the current HEAD.
>
> > Please apply this patch with the extension lock patch[1] when testing
> > as this patch can try to extend visibility map pages concurrently.
> >
>
> Because the extension lock patch leads to performance degradation when
> bulk-loading into a partitioned table, I think that the original
> proposal, which makes group locking conflict on relation extension
> locks, is the more realistic approach. So I worked on this with a simple
> patch instead of [1]. Attached are three patches:
>
> * The 0001 patch publishes some static functions such as
> heap_parallelscan_startblock_init so that the parallel vacuum code can
> use them.
> * The 0002 patch makes group locking conflict on relation extension locks.
> * The 0003 patch adds a parallel option to lazy vacuum.
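
To illustrate the idea behind the 0002 patch, here is a small sketch of a conflict check in which members of the same lock group normally do not block each other, except on relation extension locks. This is not the patch's code: the types and the function lock_request_conflicts are hypothetical, and the real lock manager tracks modes and holders in far more detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    LOCK_KIND_RELATION,         /* ordinary heavyweight relation lock */
    LOCK_KIND_RELATION_EXTEND   /* relation extension lock */
} LockKind;

/* Hypothetical stand-in for a backend process entry. */
typedef struct {
    int pid;
    int group_id;               /* 0 = not in a lock group */
} ProcSketch;

static bool in_same_group(const ProcSketch *a, const ProcSketch *b)
{
    return a->group_id != 0 && a->group_id == b->group_id;
}

/*
 * modes_conflict says whether the held and requested lock modes conflict
 * according to the usual conflict table.
 */
static bool lock_request_conflicts(const ProcSketch *holder,
                                   const ProcSketch *requester,
                                   LockKind kind, bool modes_conflict)
{
    assert(holder != NULL && requester != NULL);

    if (!modes_conflict)
        return false;           /* compatible modes never conflict */
    if (holder->pid == requester->pid)
        return false;           /* a process does not conflict with itself */

    /* Group members skip conflicts, except for relation extension locks. */
    if (in_same_group(holder, requester) && kind != LOCK_KIND_RELATION_EXTEND)
        return false;

    return true;
}
```

With this rule, a parallel vacuum worker and its leader can both hold ordinary relation locks, but two of them trying to extend the same relation would serialize.
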
>
> Please review them.
>

Oops, forgot to attach patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment                                                        Content-Type         Size
v8-0001-Publish-some-parallel-heap-scan-functions.patch           application/x-patch  2.2 KB
v8-0002-Make-group-locking-conflict-when-relation-exntesi.patch   application/x-patch  1009 bytes
v8-0003-Add-parallel-option-to-lazy-vacuum.patch                  application/x-patch  145.9 KB
