Re: regarding CLUSTER and HUGE work_mem / maintenance_work_mem

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: regarding CLUSTER and HUGE work_mem / maintenance_work_mem
Date: 2012-01-28 02:34:16
Message-ID: CAKuK5J1LCnJb8qeL9jUOGeWGn1epdWiNyyTSK3NKXSFuBOHS_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Jan 27, 2012 at 12:05 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 27.01.2012 19:43, Jon Nelson wrote:
>>
>> Let's say I have a 7GB table with 3-4 indices for a total of 10-12GB.
>> Furthermore, let's say I have a machine with sufficient memory for me
>> to set the work_mem  and maintenance_work_mem to 20GB (just for this
>> session).
>> When I issue a CLUSTER using one of the indices, I see PostgreSQL (by
>> way of strace) performing an index scan which amounts to large
>> quantities of random I/O.
>> In my case, that means it takes a very, very long time. PostgreSQL is
>> largely at defaults, except for a 2GB shared_buffers and a few
>> unrelated changes. The system itself has 32GB of physical RAM and has
>> plenty free.
>> Why didn't PostgreSQL just read the table into memory (and the
>> interesting index) as a sequential scan, sort, and then write it out?
>> It seems like there would be more than enough memory for that. The
>> sequential I/O rate on this machine is 50-100x the random I/O rate.
>>
>> I'm using 8.4.10 (with the 'inet' de-toasting patch) on Scientific Linux
>> 6.1.
>
>
> The suppport for doing a seqscan+sort in CLUSTER was introduced in version
> 9.1. Before that, CLUSTER always did an indexscan. See release notes:
> http://www.postgresql.org/docs/9.1/static/release-9-1.html#AEN107416

That's what I get for digging through the source (git) but working
with 8.4.10, on a Friday, at the end of a long week.
Thanks for pointing that out to somebody that should have known better.

--
Jon

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2012-01-28 03:04:00 Re: regarding CLUSTER and HUGE work_mem / maintenance_work_mem
Previous Message Claudio Freire 2012-01-28 02:23:45 Re: Postgress is taking lot of CPU on our embedded hardware.