Re: Sort and index

From: Andrei Gaspar <andi(at)softnrg(dot)dnttm(dot)ro>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Sort and index
Date: 2005-04-20 17:10:42
Message-ID: 42668D12.1060701@softnrg.dnttm.ro
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Michael Fuhr wrote:

>On Mon, Apr 18, 2005 at 10:44:43AM -0500, Dave Held wrote:
>
>
>>>I thought that an index can be used for sorting.
>>>I'm a little confused about the following result:
>>>
>>>create index OperationsName on Operations(cOperationName);
>>>explain SELECT * FROM Operations ORDER BY cOperationName;
>>> QUERY PLAN
>>>--------------------------------------------------------------
>>>---------
>>> Sort (cost=185.37..189.20 rows=1532 width=498)
>>> Sort Key: coperationname
>>> -> Seq Scan on operations (cost=0.00..104.32 rows=1532 width=498)
>>>(3 rows)
>>>
>>>Is this supposed to be so?
>>>
>>>
>>Since you are fetching the entire table, you are touching all the rows.
>>If the query were to fetch the rows in index order, it would be seeking
>>all over the table's tracks. By fetching in sequence order, it has a
>>much better chance of fetching rows in a way that minimizes head seeks.
>>Since disk I/O is generally 10-100x slower than RAM, the in-memory sort
>>can be surprisingly slow and still beat indexed disk access. Of course,
>>this is only true if the table can fit and be sorted entirely in memory
>>(which, with 1500 rows, probably can).
>>
>>
>
>Out of curiosity, what are the results of the following queries?
>(Queries run twice to make sure time differences aren't due to
>caching.)
>
>SET enable_seqscan TO on;
>SET enable_indexscan TO off;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>
>SET enable_seqscan TO off;
>SET enable_indexscan TO on;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
>
>SELECT version();
>
>With 1500 rows of random data, I consistently see better performance
>with an index scan (about twice as fast as a sequence scan), and
>the planner uses an index scan if it has a choice (i.e., when
>enable_seqscan and enable_indexscan are both on). But my test case
>and postgresql.conf settings might be different enough from yours
>to account for different behavior.
>
>
>
Here is the output from the statements above. I know the times seem too
small to care, but what triggered my question is the fact that in the
logs there are a lot of lines like (i replaced the list of 43 fields
with *).
I use ODBC (8.0.1.1) and to change the application to cache the table
isn't feasible.

2005-04-19 10:07:05 LOG: duration: 937.000 ms statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"
2005-04-19 10:07:09 LOG: duration: 1344.000 ms statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"
2005-04-19 10:07:15 LOG: duration: 1031.000 ms statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"
2005-04-19 10:07:19 LOG: duration: 734.000 ms statement: PREPARE
"_PLAN35b0068" as SELECT * FROM Operations ORDER BY
cOperationName;EXECUTE "_PLAN35b0068"

The times reported by explain analyze are so small though, the intervals
reported in pg_log are more real,

tkp=# SET enable_seqscan TO on;
SET
tkp=# SET enable_indexscan TO off;
SET
tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Sort (cost=185.37..189.20 rows=1532 width=498) (actual
time=235.000..235.000 rows=1532 loops=1)
Sort Key: coperationname
-> Seq Scan on operations (cost=0.00..104.32 rows=1532 width=498)
(actual time=0.000..124.000 rows=1532 loops=1)
Total runtime: 267.000 ms
(4 rows)

tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Sort (cost=185.37..189.20 rows=1532 width=498) (actual
time=16.000..16.000 rows=1532 loops=1)
Sort Key: coperationname
-> Seq Scan on operations (cost=0.00..104.32 rows=1532 width=498)
(actual time=0.000..0.000 rows=1532 loops=1)
Total runtime: 31.000 ms
(4 rows)

tkp=#
tkp=# SET enable_seqscan TO off;
SET
tkp=# SET enable_indexscan TO on;
SET
tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------
Index Scan using operationsname on operations (cost=0.00..350.01
rows=1532 width=498) (actual time=16.000..62.000 rows=1532 loops=1)
Total runtime: 62.000 ms
(2 rows)

tkp=# EXPLAIN ANALYZE SELECT * FROM Operations ORDER BY cOperationName;
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
Index Scan using operationsname on operations (cost=0.00..350.01
rows=1532 width=498) (actual time=0.000..16.000 rows=1532 loops=1)
Total runtime: 16.000 ms
(2 rows)

tkp=#
tkp=# SELECT version();
version
------------------------------------------------------------------------------------------
PostgreSQL 8.0.2 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
3.4.2 (mingw-special)
(1 row)

--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.9.18 - Release Date: 4/19/2005

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Shachindra Agarwal 2005-04-20 17:16:05 postgres slowdown question
Previous Message Greg Stark 2005-04-20 17:10:05 Re: Opteron vs Xeon (Was: What to do with 6 disks?)