Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-01-18 12:41:42
Message-ID: CAD21AoC96XPEMW0eEcBX9Gkw1FPPWbHii68hoqDaCwB8-rpfZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi
<kommi(dot)haribabu(at)gmail(dot)com> wrote:
>
>
> On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>>
>> Rebased.
>
>
> I started reviewing the patch, I didn't finish my review yet.
> Following are some of the comments.

Thank you for reviewing the patch.

>
> + <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
> + <listitem>
> + <para>
> + Execute index vacuum and cleanup index in parallel with
>
> I doubt that user can understand the terms index vacuum and cleanup index.
> May be it needs some more detailed information.
>

Agreed. Table 27.22 "Vacuum phases" has a good description of vacuum
phases. So maybe adding the referencint to it would work.

>
> - VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */
> + VACOPT_PARALLEL = 1 << 7, /* do lazy VACUUM in parallel */
> + VACOPT_DISABLE_PAGE_SKIPPING = 1 << 8 /* don't skip any pages */
> +} VacuumOptionFlag;
>
> Any specific reason behind not adding it as last member of the enum?
>

My mistake, fixed it.

>
> -typedef enum VacuumOption
> +typedef enum VacuumOptionFlag
> {
>
> I don't find the new name quite good, how about VacuumFlags?
>

Agreed with removing "Option" from the name but I think VacuumFlag
would be better because this enum represents only one flag. Thoughts?

>
> +typedef struct VacuumOption
> +{
>
> How about VacuumOptions? Because this structure can contains all the
> options provided to vacuum operation.
>

Agreed.

>
>
> + vacopt1->flags |= vacopt2->flags;
> + if (vacopt2->flags == VACOPT_PARALLEL)
> + vacopt1->nworkers = vacopt2->nworkers;
> + pfree(vacopt2);
> + $$ = vacopt1;
> + }
>
> As the above statement indicates the the last parallel number of workers
> is considered into the account, can we explain it in docs?
>

Agreed.

>
> postgres=# vacuum (parallel 2, verbose) tbl;
>
> With verbose, no parallel workers related information is available.
> I feel giving that information is required even when it is not parallel
> vacuum also.
>

Agreed. How about the folloiwng verbose output? I've added the number
of launched, planned and requested vacuum workers and purpose (vacuum
or cleanup).

postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table
'test' has 3 indexes
INFO: vacuuming "public.test"
INFO: launched 2 parallel vacuum workers for index vacuum (planned:
2, requested: 30)
INFO: scanned index "test_idx1" to remove 2000 row versions
DETAIL: CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
INFO: scanned index "test_idx2" to remove 2000 row versions by
parallel vacuum worker
DETAIL: CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
INFO: scanned index "test_idx3" to remove 2000 row versions by
parallel vacuum worker
DETAIL: CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
INFO: "test": removed 2000 row versions in 10 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: launched 2 parallel vacuum workers for index cleanup (planned:
2, requested: 30)
INFO: index "test_idx1" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: index "test_idx2" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: index "test_idx3" now contains 991151 row versions in 2745 pages
DETAIL: 2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 2000 removable, 367 nonremovable row versions in
41 out of 4425 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 500
There were 6849 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
VACUUM

Since the previous patch conflicts with 285d8e12 I've attached the
latest version patch that incorporated the review comment I got.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
v12-0002-Add-P-option-to-vacuumdb-command.patch text/x-patch 6.6 KB
v12-0001-Add-parallel-option-to-VACUUM-command.patch text/x-patch 76.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2019-01-18 12:55:28 Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0
Previous Message Surafel Temesgen 2019-01-18 12:01:46 Re: pg_dump multi VALUES INSERT