From: | Josh Kupershmidt <schmiddy(at)gmail(dot)com> |
---|---|
To: | Leonardo Francalanci <m_lists(at)yahoo(dot)it> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: I: About "Our CLUSTER implementation is pessimal" patch |
Date: | 2010-10-02 00:36:36 |
Message-ID: | AANLkTin1gwabNHp7ViNNMoKM+Utr53WTJH7g=2HvYTWL@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Oct 1, 2010 at 4:33 AM, Leonardo Francalanci <m_lists(at)yahoo(dot)it> wrote:
>> I ran a few more performance tests on this patch. Here's what I got
>> for the tests Leonardo posted originally:
>> * 2M rows: 22 seconds for seq. scan, 24 seconds for index scan
>> * 5M rows: 139 seconds for seq. scan, 97 seconds for index scan
>> * 10M rows: 256 seconds seq. scan, 611 seconds for index scan
>
> I don't have time right now to run more tests, I'll try to make some by
> next week.
>
> Would it mean that doing:
>
> create table p as select * from atable order by akey
>
> (where akey is random distributed)
> with 5M rows is faster with enable_seqscan=0 and
> enable_indexscan=1??? That would be weird, especially on a
> laptop hard drive! (assuming there's a reasonable amount of
> memory set in work_mem/maintenance_work_mem)
Hrm, this is interesting. I set up a test table with 5M rows like so:
CREATE TABLE atable (
akey int
);
INSERT INTO atable (akey)
SELECT (RANDOM() * 100000)::int FROM generate_series(1,5000000);
CREATE INDEX akey_idx ON atable(akey);
ANALYZE atable;
And then I tested table creation times. First, using a normal:
BEGIN;
SET enable_seqscan = on;
SET enable_indexscan = on;
EXPLAIN ANALYZE CREATE TABLE idxscanned AS SELECT * FROM atable
ORDER BY akey;
ROLLBACK;
and I get:
Index Scan using akey_idx on atable
(cost=0.00..218347.89 rows=5000000 width=4)
(actual time=0.058..23612.020 rows=5000000 loops=1)
Total runtime: 33029.884 ms
Then, I tried forcing a sequential scan by changing "SET
enable_indexscan = off;", and it's significantly faster, as I would
expect:
Sort (cost=696823.42..709323.42 rows=5000000 width=4)
(actual time=8664.699..13533.131 rows=5000000 loops=1)
Sort Key: akey
Sort Method: external merge Disk: 68304kB
-> Seq Scan on atable (cost=0.00..72124.00 rows=5000000 width=4)
(actual time=0.012..838.092 rows=5000000 loops=1)
Total runtime: 21015.501 ms
I've ran both of these several times, and get 30-32 seconds for the
index scan and 20-21 seconds for the seq. scan each time.
My seq_page_cost and random_page_cost were left at the defaults for
these tests. Oddly, I tried turning seq_page_cost down to 0.01 and
EXPLAIN ANALYZE told me that an index scan was still being chosen. Is
there maybe some other setting I'm forgetting?
Josh
From | Date | Subject | |
---|---|---|---|
Next Message | David E. Wheeler | 2010-10-02 02:57:50 | Re: array_agg() NULL Handling |
Previous Message | Jaime Casanova | 2010-10-01 22:52:50 | Re: recovery.conf location |