Re: Parallel Vacuum

From: Dimitri <dimitrik(dot)fr(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Michael Stone <mstone+postgres(at)mathom(dot)us>
Subject: Re: Parallel Vacuum
Date: 2007-03-22 18:24:38
Message-ID: 200703221924.39278.dimitrik.fr@gmail.com
Lists: pgsql-performance

Mike,

you're right as long as you're using a single disk :)
Now imagine you have more disks: the more I/O operations you can perform, the
more CPU time you need to process them :) until each 'vacuumdb' fully uses
one CPU, and at that point a single process cannot go any faster...

Also, even when the CPU is not heavily used by vacuumdb, a single process
still cannot get the maximum performance out of the storage array, simply
because you need several concurrent I/O requests running in the system to
reach maximum throughput. Even if the filesystem helps you here, that's not
the whole story... The more concurrent writers you have, the higher the
performance you reach (up to the real limit)...
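
A rough way to see why concurrency matters is Little's law: throughput is
roughly the number of requests in flight divided by the latency of each one,
capped by what the array can deliver. The sketch below is only a simplified
model. The 5 ms latency is an assumed value, the function name is made up for
illustration, and it ignores readahead and caching; the 5000 op/s ceiling is
the figure quoted for the array further down.

```python
def achievable_ops(requests_in_flight, latency_ms, array_limit_ops):
    """Estimate ops/s from Little's law, capped by the array's own limit.

    latency_ms is an assumed per-request latency, not a measured value.
    """
    uncapped = requests_in_flight * 1000 / latency_ms
    return min(uncapped, array_limit_ops)


if __name__ == "__main__":
    # Assumed numbers: 5 ms per request, 5000 op/s array ceiling.
    for in_flight in (1, 4, 25, 100):
        print(in_flight, achievable_ops(in_flight, 5.0, 5000.0))
```

Under these assumptions, one request in flight gives about 200 op/s, and it
takes around 25 concurrent requests to reach the array's ceiling.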

In my case I have a small storage array capable of delivering more than
500MB/sec and, say, 5000 op/s. All my data is striped across all the array's
disks. A single 'vacuumdb' process here becomes CPU-bound rather than
I/O-bound, as it cannot fully load the storage array... So the more vacuum
processes I start in parallel, the faster I'll finish vacuuming the database.
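
The idea can be sketched as a small driver that spreads tables over several
workers, each running 'vacuumdb' on one table at a time. This is only an
illustration, not the script actually used here: the function names, the
round-robin split, and the database and table names are all hypothetical. It
relies only on the 'vacuumdb' options -d (database) and -t (table).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_round_robin(tables, workers):
    """Deal tables out to `workers` groups, one table at a time."""
    return [tables[i::workers] for i in range(workers)]


def vacuum_command(dbname, table):
    """Build the vacuumdb command line for a single table."""
    return ["vacuumdb", "-d", dbname, "-t", table]


def vacuum_tables(dbname, tables, run=subprocess.call):
    """Vacuum one group of tables sequentially; return the exit codes."""
    return [run(vacuum_command(dbname, table)) for table in tables]


def parallel_vacuum(dbname, tables, workers, run=subprocess.call):
    """Run one sequential vacuum stream per worker, all streams in parallel."""
    groups = [g for g in split_round_robin(tables, workers) if g]
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        results = pool.map(lambda g: vacuum_tables(dbname, g, run), groups)
        return [code for group in results for code in group]


if __name__ == "__main__":
    # Hypothetical database and table names, four parallel streams.
    codes = parallel_vacuum("mydb", ["orders", "items", "history", "stock"], 4)
    print("failed" if any(codes) else "ok")
```

The number of workers is the knob to tune: keep raising it until either the
CPUs or the storage array is saturated.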

Best regards!
-Dimitri

On Thursday 22 March 2007 18:10, Michael Stone wrote:
> On Thu, Mar 22, 2007 at 04:55:02PM +0100, Dimitri wrote:
> >In my case I have several CPU on the server and quite powerful storage box
> >which is not really busy with a single vacuum. So, my idea is quite simple
> > - speed-up vacuum with parallel execution (just an algorithm):
>
> Vacuum is I/O intensive, not CPU intensive. Running more of them will
> probably make things slower rather than faster, unless each thing you're
> vacuuming has its own (separate) disks. The fact that your CPU isn't
> pegged while vacuuming suggests that your disk is already your
> bottleneck--and doing multiple sequential scans on the same disk will
> definitely be slower than doing one.
>
> Mike Stone
