Re: CLUSTER versus a dedicated table

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Robert James <srobertjames(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: CLUSTER versus a dedicated table
Date: 2011-06-02 01:33:32
Message-ID: BANLkTi=UN9gr2ufb9pmBgdvCgKDHMBvkUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Jun 1, 2011 at 7:54 PM, Robert James <srobertjames(at)gmail(dot)com> wrote:
> Hi.  I'm interested in understanding the differences between
> CLUSTERing a table and making a dedicated one.
>
> We have a table with about 1 million records.  On a given day, only
> about 1% of them are of interest.  That 1% changes every day (it's
> WHERE active_date = today), and so we index and cluster on it.
>
> Even so, the planner shows a very large cost for the Index Scan: about
> 3500.  If I instead do a SELECT INTO temp_table FROM big_table WHERE
> active_date = today, and then do SELECT * FROM temp_table, I get a
> planned cost of 65.  Yet, the actual time for both queries is almost
> identical.
>
> Questions:
> 1. Why is there such a discrepancy between the planner's estimate and
> the actual cost?
>
> 2. In a case like this, will I in general see a performance gain by
> doing a daily SELECT INTO and then querying from that table? My ad hoc
> test doesn't indicate I would (despite the planner's prediction), and
> I'd rather avoid this if it won't help.
>
> 3. In general, does CLUSTER provide all the performance benefits of a
> dedicated table? If it doesn't, what does it lack?

no. i suspect you may be over thinking the problem -- what led you to
want to cluster in the first place?

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jarrod Chesney 2011-06-02 01:46:38 Re: Delete performance
Previous Message Merlin Moncure 2011-06-02 01:31:18 Re: Problem query