Re: Performance of CLUSTER

From: Mark Thornton <mthornton(at)optrak(dot)com>
To: sthomas(at)optionshouse(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance of CLUSTER
Date: 2012-06-11 14:02:23
Message-ID: 4FD5FA6F.3080903@optrak.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 11/06/12 14:52, Shaun Thomas wrote:
> On 06/11/2012 08:46 AM, Mark Thornton wrote:
>
>> 500m --- though isn't clear if cluster uses maintenance memory or the
>> regular work memory. I could readily use a higher value for
>> maintenance_work_mem.
>
> For an operation like that, having a full GB wouldn't hurt. Though if
> you haven't already, you might think about pointing
I didn't think the process was using even the 500m it ought to have had
available, whereas creating an index did appear to use that much. Note
though that I didn't stay up all night watching it!

> your pgsql_tmp to /dev/shm for a while, even for just this operation.
>
> Then again, is your CPU at 100% during the entire operation?
No the CPU utilization is quite low. Most of the time is waiting for IO.

> If it's not fetching anything from disk or writing out much, reducing
> IO won't help. :) One deficiency we've had with CLUSTER is that it's a
> synchronous operation. It does each thing one after the other. So
> it'll organize the table contents, then it'll reindex each index
> (including the primary key) one after the other. If you have a lot of
> those, that can take a while, especially if any composite or complex
> indexes exist.
In this case there are only two indexes, the gist one and a primary key
(on a bigint value).

>
> You might actually be better off running parallel REINDEX commands on
> the table (I wrote a script for this because we have a 250M row table
> that each index takes 1-2.5 hours to build). You might also consider
> pg_reorg, which seems to handle some parts of a table rebuild a little
> better.
>
> That should give you an escalation pattern, though. :)

Thanks for your help,
Mark Thornton

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Shaun Thomas 2012-06-11 14:14:52 Re: Performance of CLUSTER
Previous Message Fitch, Britt 2012-06-11 13:55:29 postgres clustering interactions with pg_dump