Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-08-04 06:11:57
Message-ID: 4205E661176A124FAF891E0A6BA9135266347F25@szxeml509-mbs.china.huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 31 July 2014 10:59, Amit kapila Wrote,

Thanks for the review and valuable comments.
I have fixed all the comments and attached the revised patch.

As per your suggestion I have taken the performance report also…

Test1:

Machine Configuration:

Core : 8 (Intel(R) Xeon(R) CPU E5520 @ 2.27GHz)

RAM: 48GB

Test Scenario:

8 tables all with 1M+ records. [many records are deleted and inserted using some pattern, (files is attached in the mail)]

Test Result

Base Code: 43.126s

Parallel Vacuum Code

2 Threads : 29.687s

8 Threads : 14.647s

Test2: (as per your scenario, where actual vacuum time is very less)
Vacuum done for complete DB
8 tables all with 10000 records and few dead tuples

Test Result

Base Code: 0.59s

Parallel Vacuum Code

2 Threads : 0.50s

4 Threads : 0.29s

8 Threads : 0.18s

Regards,
Dilip Kumar

From: Amit Kapila [mailto:amit(dot)kapila16(at)gmail(dot)com]
Sent: 31 July 2014 10:59
To: Dilip kumar
Cc: Magnus Hagander; Alvaro Herrera; Jan Lentfer; Tom Lane; PostgreSQL-development; Sawada Masahiko; Euler Taveira
Subject: Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

On Fri, Jul 18, 2014 at 10:22 AM, Dilip kumar <dilip(dot)kumar(at)huawei(dot)com<mailto:dilip(dot)kumar(at)huawei(dot)com>> wrote:
> On 16 July 2014 12:13, Magnus Hagander Wrote,
> >Yeah, those are exactly my points. I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier to make portable...
>
> >(and as a optimization on Alvaros suggestion, you can of course reuse the initial connection as one of the workers as long as you got the full list of tasks from it up front, which I think you do anyway in order to sorting of tasks...)
>
> I have modified the patch as per the suggestion,
>
> Now in beginning we create all connections, and first connection we use for getting table list in beginning, After that all connections will be involved in vacuum task.
>
> Please have a look and provide your opinion…

1.
+ connSlot = (ParallelSlot*)pg_malloc(parallel * sizeof(ParallelSlot));
+ for (i = 0; i < parallel; i++)
+ {
+ connSlot[i].connection = connectDatabase(dbname, host, port, username,
+ prompt_password, progname, false);
+
+ PQsetnonblocking(connSlot[i].connection, 1);
+ connSlot[i].isFree = true;
+ connSlot[i].sock = PQsocket(connSlot[i].connection);
+ }
Here it seems to me that you are opening connections before
getting or checking tables list, so in case you have lesser
number of tables, won't the extra connections be always idle.
Simple case to erify the same is with below example

vacuumdb -t t1 -d postgres -j 4

2.
+ res = executeQuery(conn,
+ "select relname, nspname from pg_class c, pg_namespace ns"
+ " where relkind= \'r\' and c.relnamespace = ns.oid"
+ " order by relpages desc",
+ progname, echo);

Here it is just trying to get the list of relations, however
Vacuum command processes materialized views as well, so
I think here the list should include materialized views as well
unless you have any specific reason for not including those.

3. In function vacuum_parallel(), if user has not provided list of tables,
then it is retrieving all the tables in database and then in run_parallel_vacuum(),
it tries to do Vacuum for each of table using Async mechanism, now
consider a case when after getting list if any table is dropped by user
from some other session, then patch will error out. However without patch
or Vacuum command will internally ignore such a case and complete
the Vacuum for other tables. Don't you think the patch should maintain
the existing behaviour?

4.
+ <term><option>-j <replaceable class="parameter">jobs</replaceable></></term>
+ Number of parallel process to perform the operation.

Change this description as per new implementation. Also I think
there is a need of some explanation for this new option.

5.
It seems there is no change in below function decalration:
static void vacuum_one_database(const char *dbname, bool full, bool verbose,
! bool and_analyze, bool analyze_only, bool analyze_in_stages,
! bool freeze, const char *table, const char *host,
! const char *port, const char *username,
! enum trivalue prompt_password,
const char *progname, bool echo);

6.
+ printf(_(" -j, --jobs=NUM use this many parallel jobs to vacuum\n"));
Change the description as per new implementation.

7.
/* This will give the free connection slot, if no slot is free it will
wait for atleast one slot to get free.*/

Multiline comments should be written like (refer other places)
/*
* This will give the free connection slot, if no slot is free it will
* wait for atleast one slot to get free.
*/
Kindly correct at other places if similar instance exist in patch.

8.
Isn't it a good idea to check performance of this new patch
especially for some worst cases like when there is not much
to vacuum in the tables inside a database. The reason I wanted
to check is that because with new algorithm (for a vacuum of database,
now it will get the list of tables and perform vacuum on individual
tables) we have to repeat certain actions in server side like
allocation/deallocataion of context, sending stats which would have
been otherwise done once.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
testcase.sql application/octet-stream 2.9 KB
vacuumdb_parallel_v12.patch application/octet-stream 20.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2014-08-04 07:52:30 Re: Use unique index for longer pathkeys.
Previous Message Rainer Tammer 2014-08-04 05:56:40 Re: Fwd: Re: Compile fails on AIX 6.1