Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-12-01 06:48:41
Message-ID: 4205E661176A124FAF891E0A6BA9135266380268@szxeml509-mbs.china.huawei.com
Lists: pgsql-hackers

On 24 November 2014 11:29, Amit Kapila wrote:

>I think still some of the comments given upthread are not handled:
>
>a. About cancel handling

Your actual comment was:

>One other related point is that I think still cancel handling mechanism
>is not completely right, code is doing that when there are not enough
>number of freeslots, but otherwise it won't handle the cancel request,
>basically I am referring to below part of code:

run_parallel_vacuum()
{
    ..
    for (cell = tables->head; cell; cell = cell->next)
    {
        ..
        free_slot = GetIdleSlot(connSlot, max_slot, dbname, progname,
                                completedb);

        PQsendQuery(slotconn, sql.data);

        resetPQExpBuffer(&sql);
    }
}

1. The connection blocks only while waiting for a free slot (in GetIdleSlot) or while fetching results (in GetQueryResult), so I have handled the cancel request in both places.

2. Now for the case you mentioned, where there are enough slots and the above for loop is running: if the user presses Ctrl+C, the loop will not break. I have handled this by checking inAbort inside the for loop before sending a new command. I don't think we can do this through the cancel request alone, because a connection only realizes it has been cancelled when it receives some data and the query fails; until the connection receives data, it will not see the failure. So I check inAbort directly.
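To illustrate point 2, here is a minimal sketch of the dispatch loop (the loop variables come from the fragment quoted above; the ParallelSlot connection member and the prepare_command helper are assumptions for illustration only):

/*
 * Sketch only: stop dispatching new commands once the user has
 * cancelled.  inAbort is assumed to be a flag set from the SIGINT
 * handler (and when a cancel request is sent successfully).
 */
for (cell = tables->head; cell; cell = cell->next)
{
    /* Do not send any further commands after Ctrl+C. */
    if (inAbort)
        break;

    free_slot = GetIdleSlot(connSlot, max_slot, dbname, progname,
                            completedb);
    if (free_slot == NULL)
        break;                      /* cancelled or connection lost */

    prepare_command(&sql, cell->val);       /* hypothetical helper */
    PQsendQuery(free_slot->connection, sql.data);
    resetPQExpBuffer(&sql);
}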

>b. Setting of inAbort flag for case where PQCancel is successful

Your actual comment was:

>Yeah, user has tried to terminate, however utility will emit the
>message: "Could not send cancel request" in such a case and still
>silently tries to cancel and disconnect all connections.

You are right. I have fixed the code; now, if the cancel request fails, we do not set the inAbort flag.
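For reference, the relevant libpq calls are PQgetCancel()/PQcancel()/PQfreeCancel(); a sketch of the fixed logic (the inAbort flag and the error message wording follow the discussion above; the function wrapper itself is mine):

#include <stdio.h>
#include <stdbool.h>
#include "libpq-fe.h"

static bool inAbort = false;    /* assumed: consulted by the dispatch loop */

/*
 * Sketch only: set inAbort only when the cancel request is actually
 * sent; on failure, report the error and leave the flag unset.
 */
static void
try_cancel(PGconn *conn, const char *progname)
{
    PGcancel   *cancel = PQgetCancel(conn);
    char        errbuf[256];

    if (cancel != NULL)
    {
        if (PQcancel(cancel, errbuf, sizeof(errbuf)))
            inAbort = true;     /* request sent; workers will wind down */
        else
            fprintf(stderr, "%s: Could not send cancel request: %s",
                    progname, errbuf);
        PQfreeCancel(cancel);
    }
}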

>c. Performance data of --analyze-in-stages switch

Performance Data
------------------------------
CPU: 8 cores
RAM: 16 GB
checkpoint_segments = 256

Before each test, run the test.sql (attached)

Unpatched:
dilip(at)linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb -p 9005 --analyze-in-stages -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real 0m0.843s
user 0m0.000s
sys 0m0.000s

Patched:
dilip(at)linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb -p 9005 --analyze-in-stages -j 2 -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real 0m0.593s
user 0m0.004s
sys 0m0.004s

dilip(at)linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb -p 9005 --analyze-in-stages -j 4 -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real 0m0.421s
user 0m0.004s
sys 0m0.004s

With 2 connections the runtime drops from 0.843s to 0.593s, about a 30% improvement; with 4 connections it drops to 0.421s, roughly 50%.

>d. Have one pass over the comments in patch. I can still see some
>wrong multiline comments. Refer below:
>+ /* Otherwise, we got a stage from vacuum_all_databases(), so run
>+ * only that one. */

Checked all of them and fixed.

While testing, I found one more behavior that differs from the base code.

Base Code:
dilip(at)linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb -p 9005 -t "t1" -t "t2" -t "t3" -t "t4" --analyze-in-stages -d postgres

Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real 0m0.605s
user 0m0.004s
sys 0m0.000s

I think it should work like this:
SET default_statistics_target=1; then ANALYZE all four tables;
SET default_statistics_target=10; then ANALYZE all four tables; and so on, as sketched below.
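
A minimal sketch of that expected control flow (the stage command strings follow what vacuumdb sends for --analyze-in-stages; conn, echo, and the run_analyze_stage helper are assumptions for illustration):

/*
 * Sketch only: keep the stage loop outermost so each statistics
 * target is set once and then applied to every table.
 */
static const char *stage_commands[] = {
    "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
    "SET default_statistics_target=10; RESET vacuum_cost_delay;",
    "RESET default_statistics_target;"
};

for (stage = 0; stage < 3; stage++)
{
    executeCommand(conn, stage_commands[stage], progname, echo);

    /* ANALYZE every listed table at this target before moving on. */
    for (cell = tables->head; cell; cell = cell->next)
        run_analyze_stage(conn, cell->val);     /* hypothetical helper */
}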

With patch:
dilip(at)linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb -p 9005 -t "t1" -t "t2" -t "t3" -t "t4" --analyze-in-stages -j 2 -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real 0m0.395s
user 0m0.000s
sys 0m0.004s

Here we set each target once and run it for all the tables.

Please provide your opinion.

Regards,
Dilip Kumar

Attachment Content-Type Size
test.sql application/octet-stream 514 bytes
vacuumdb_parallel_v19.patch application/octet-stream 27.5 KB
