Re: I: About "Our CLUSTER implementation is pessimal" patch

From: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
To: Leonardo Francalanci <m_lists(at)yahoo(dot)it>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: I: About "Our CLUSTER implementation is pessimal" patch
Date: 2010-10-05 02:21:42
Message-ID: AANLkTinLuuBwDEf9XfTmBw-sj9j4pDnSiSyH2bAkjg14@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 4, 2010 at 4:47 PM, Leonardo Francalanci <m_lists(at)yahoo(dot)it> wrote:
>> It sounds like the costing model might need a bit more work before we commit
>> this.
>
>
> I tried again the simple sql tests I posted a while ago, and I still get the
> same ratios.
> I've tested the applied patch on a dual opteron + disk array Solaris machine.
>
> I really don't get how a laptop hard drive can be faster at reading data using
> random seeks (required by the original cluster method) than seq scan + sort
> for the 5M rows test case.
> Same thing for the "cluster vs bloat" test: the seq scan + sort is faster on my
> machine.

Well, my last tests showed that the planner was choosing an index scan
for queries like:

SELECT * FROM atable ORDER BY akey;

but forcing a seqscan + sort made this faster, as you expect. So I was
thinking my cost settings (posted upthread) probably need some
tweaking, unless it's a problem with the optimizer. But all of this is
unrelated to the patch.
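
A rough sketch of one way to force the seqscan + sort plan for comparison,
assuming the stock enable_indexscan and enable_bitmapscan planner settings
and reusing atable/akey from the query above. These are not necessarily the
exact commands used in these tests:

```sql
-- Baseline: let the planner pick its preferred plan (an index scan here).
EXPLAIN ANALYZE SELECT * FROM atable ORDER BY akey;

-- Discourage index-based plans for this transaction only, so the planner
-- falls back to seqscan + sort. "off" heavily penalizes these plan types
-- rather than strictly forbidding them.
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;
EXPLAIN ANALYZE SELECT * FROM atable ORDER BY akey;
ROLLBACK;
```

Comparing the estimated costs against the actual times in the two outputs
shows whether the cost settings or the optimizer are steering the choice.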

[... pokes a bit more ...] Sigh, now I'm finding it impossible to
reproduce my own results, particularly the earlier cluster_vs_bloat.sql
test of:

* 10M rows: 84 seconds for seq. scan, 44 seconds for index scan

I'm getting about 5 seconds now for the cluster, both with and without
the patch. Changing effective_cache_size doesn't seem to affect this
much. I'll have another look when I have some more time.

Josh
