TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2013-11-07 11:42:59
Message-ID: 4205E661176A124FAF891E0A6BA9135265924388@SZXEML507-MBS.china.huawei.com
Lists: pgsql-hackers

This patch implements the following TODO item:

Allow parallel cores to be used by vacuumdb
http://www.postgresql.org/message-id/4F10A728.7090403@agliodbs.com

Like parallel pg_dump, vacuumdb gets an option to vacuum multiple tables in parallel. [ vacuumdb -j ]

1. One new option is added to vacuumdb to specify the number of workers.

2. All workers are started at the beginning and wait for vacuum instructions from the master.

3. If a table list is given on the vacuumdb command line using -t, the master sends the vacuum of one table to one IDLE worker, the next table to the next IDLE worker, and so on.

4. If vacuum is requested for a whole database, the master runs a SELECT on pg_class to get the table list, fetches the table names one by one, and assigns each vacuum to an IDLE worker in the same way.
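
The four steps above can be sketched as follows. This is only an illustration in Python, not the code in the attached patch (which is C). The names run_parallel and fake_vacuum are hypothetical, and fake_vacuum stands in for sending a VACUUM command over a worker's own connection.

```python
import queue
import threading


def run_parallel(tables, num_workers, vacuum_one):
    """Hand each table to whichever worker is idle; return {table: worker_id}."""
    work = queue.Queue()
    assigned = {}
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            table = work.get()      # idle: block until the master sends work
            if table is None:       # sentinel: no more tables to vacuum
                break
            vacuum_one(table)
            with lock:
                assigned[table] = worker_id

    # Step 2: start all workers up front; they wait for instructions.
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()

    # Steps 3 and 4: the table list (from -t or from pg_class) is handed
    # out one table at a time; the next idle worker picks up the next table.
    for table in tables:
        work.put(table)

    # One sentinel per worker so that every worker exits.
    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()
    return assigned


def fake_vacuum(table):
    # Hypothetical stand-in for running "VACUUM <table>" on a connection.
    pass
```

Calling run_parallel(["t1", "t2", "t3"], 2, fake_vacuum) returns a mapping from each table to the worker that handled it. Which worker gets which table depends on timing, as it would with real workers.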

Performance Data by parallel vacuumdb:

Machine Configuration:

Core : 8

RAM: 24GB

Test Scenario:

16 tables, each with 4M records. [Many records are deleted and inserted using a pattern; the test file is attached to this mail.]

Test Result:

Configuration          Workers  Time(s)  %CPU Usage  Avg Read(kB/s)  Avg Write(kB/s)
Base code              -        521      3%          12000           20000
Parallel vacuum patch  1        518      3%          12000           20000   (takes the same path as base code)
Parallel vacuum patch  2        390      5%          14000           30000
Parallel vacuum patch  8        235      7%          18000           40000
Parallel vacuum patch  16       197      8%          20000           50000

Conclusion:

Running vacuumdb in parallel increases CPU usage and I/O throughput and can cut elapsed time by more than 50%: in this test, from 521 s with the base code to 235 s with 8 workers and 197 s with 16 workers.
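
As a quick check of the figures behind this conclusion, the small Python sketch below computes the reduction in elapsed time and the speedup for each worker count. The timings are copied from the test result above; the function names are just for illustration.

```python
BASE_SECONDS = 521                            # base code, from the test result
PATCHED = {1: 518, 2: 390, 8: 235, 16: 197}   # workers -> elapsed seconds


def reduction_pct(base, t):
    """Percentage reduction in elapsed time relative to the base run."""
    return round(100.0 * (base - t) / base, 1)


def speedup(base, t):
    """How many times faster the run is than the base run."""
    return round(base / t, 2)


for workers, secs in PATCHED.items():
    print(workers, reduction_pct(BASE_SECONDS, secs), speedup(BASE_SECONDS, secs))
```

With these numbers, 8 workers give a reduction of about 55% and 16 workers about 62%, so going from 8 to 16 workers on this 8-core machine adds relatively little.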

Work to be Done:

1. Documentation of the new option.

2. Parallel support for vacuuming all databases (vacuumdb -a).

Should the common code for parallel operation in pg_dump and vacuumdb be moved to one place and reused?

A prototype patch is attached to this mail; please provide your feedback and suggestions.

Thanks & Regards,
Dilip Kumar

Attachment Content-Type Size
TestCase.sql application/octet-stream 18.4 KB
vacuumdb_parallel_v1.patch application/octet-stream 35.3 KB
