Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2013-11-13 01:36:35
Message-ID: CAB7nPqQH4XAksne0X47Fbd0YG0QApHFHLZ09tQp7WzkF+gCPpg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 7, 2013 at 8:42 PM, Dilip kumar <dilip(dot)kumar(at)huawei(dot)com> wrote:
> This patch implements the following TODO item:
>
> Allow parallel cores to be used by vacuumdb
> http://www.postgresql.org/message-id/4F10A728.7090403@agliodbs.com
>
>
>
> Like parallel pg_dump, vacuumdb is provided with an option to run the
> vacuum of multiple tables in parallel. [ vacuumdb -j ]
>
>
>
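(For illustration only, and assuming the option takes the worker count as an argument, an invocation might look like "vacuumdb -j 4 mydb"; the exact syntax is whatever the patch settles on.)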
> 1. One new option is provided with vacuumdb to give the number of
> workers.
>
> 2. All workers will be started at the beginning, and all will wait for
> vacuum instructions from the master.
>
> 3. If a table list is provided in the vacuumdb command using -t, it
> will send the vacuum of one table to one IDLE worker, the next table
> to the next IDLE worker, and so on.
>
> 4. If vacuum is given for one DB, it will execute a SELECT on
> pg_class to get the table list, fetch the table names one by one, and
> assign the vacuum responsibility to IDLE workers.
>
>
>
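For reference, below is a minimal sketch (not the patch itself) of how points 2-4 above could be wired together with libpq's asynchronous API. The connection string, worker count, the pg_class filter, and every identifier in it are illustrative assumptions, not necessarily what the attached patch does:

/*
 * Sketch: a pool of worker connections is opened up front, the table
 * list is fetched from pg_class on a master connection, and each table
 * is handed to whichever worker is currently idle.
 */
#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

#define NWORKERS 4              /* value the -j option would supply */

int
main(void)
{
    const char *conninfo = "dbname=postgres";   /* assumption */
    PGconn     *workers[NWORKERS];
    int         busy[NWORKERS] = {0};
    int         i;

    /* Point 2: start all workers up front; they wait for instructions. */
    for (i = 0; i < NWORKERS; i++)
    {
        workers[i] = PQconnectdb(conninfo);
        if (PQstatus(workers[i]) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(workers[i]));
            exit(1);
        }
    }

    /* Point 4: fetch the table list from pg_class on a master connection. */
    PGconn   *master = PQconnectdb(conninfo);
    PGresult *tables = PQexec(master,
        "SELECT quote_ident(n.nspname) || '.' || quote_ident(c.relname) "
        "FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace "
        "WHERE c.relkind = 'r'");       /* filtering details are an assumption */
    if (PQresultStatus(tables) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(master));
        exit(1);
    }

    /* Point 3: hand each table to the next idle worker. */
    for (int t = 0; t < PQntuples(tables); t++)
    {
        int     w = -1;
        char    query[1024];

        /*
         * A real implementation would select() on PQsocket() here
         * instead of busy-polling the workers.
         */
        while (w < 0)
        {
            for (i = 0; i < NWORKERS; i++)
            {
                if (busy[i])
                {
                    PQconsumeInput(workers[i]);
                    if (!PQisBusy(workers[i]))
                    {
                        PGresult *r;

                        while ((r = PQgetResult(workers[i])) != NULL)
                            PQclear(r);
                        busy[i] = 0;
                    }
                }
                if (!busy[i])
                {
                    w = i;
                    break;
                }
            }
        }

        snprintf(query, sizeof(query), "VACUUM %s", PQgetvalue(tables, t, 0));
        PQsendQuery(workers[w], query);         /* asynchronous, does not block */
        busy[w] = 1;
    }

    /* Wait for the remaining workers to finish, then clean up. */
    for (i = 0; i < NWORKERS; i++)
    {
        PGresult *r;

        while ((r = PQgetResult(workers[i])) != NULL)
            PQclear(r);
        PQfinish(workers[i]);
    }
    PQclear(tables);
    PQfinish(master);
    return 0;
}

The point of the asynchronous API here is that the master never blocks on a single worker, so a long-running VACUUM on one table does not stall the dispatch of the others.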
> Performance Data by parallel vacuumdb:
>
> Machine Configuration:
>
> Cores: 8
>
> RAM: 24GB
>
> Test Scenario:
>
> 16 tables, each with 4M records. [Many records
> are deleted and inserted using some pattern; the file is attached to the
> mail.]
>
>
>
> Test Result
>
>
>
> {Base Code}
>
>   Time(s)   %CPU Usage   Avg Read(kB/s)   Avg Write(kB/s)
>   521       3%           12000            20000
>
> {With Parallel Vacuum Patch}
>
>   Workers   Time(s)   %CPU Usage   Avg Read(kB/s)   Avg Write(kB/s)
>   1         518       3%           12000            20000    --> same path as base code
>   2         390       5%           14000            30000
>   8         235       7%           18000            40000
>   16        197       8%           20000            50000
>
>
>
> Conclusion:
>
> By running vacuumdb in parallel, CPU and I/O throughput
> increase, and it can give a >50% performance improvement.
>
>
>
> Work to be Done:
>
> 1. Documentation of the new option.
>
> 2. Parallel support for vacuuming all databases.
>
>
>
> Should the common code for parallel operation in pg_dump and
> vacuumdb be moved to one place and reused?
>
>
>
> A prototype patch is attached to the mail; please provide your
> feedback/suggestions.
>
>
>
> Thanks & Regards,
>
> Dilip Kumar
>
>
>
>
>
>
>
>

--
Michael
