From: "Dan Langille" <dan(at)langille(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: extending ORDER BY make query much slower
Date: 2003-03-12 19:36:08
Message-ID: 3E6F45D8.15284.5994716@localhost
Lists: pgsql-sql

NOTE: I've also put the analyse output at
http://www.freshports.org/tmp/faster.txt in case it's easier to read
than the format below.

I've found that adding another field to the ORDER BY clause makes query
times go through the roof.

This query is pretty fast.

explain analyse
SELECT element.id,
commit_log.id,
commit_log.message_id,
commit_log.committer,
commit_log.commit_date,
commit_log.description,
commit_log.encoding_losses,
commit_log.message_subject,
element.name,
commit_log_elements.revision_name,
element_pathname(element.id) AS element_pathname
FROM commit_log, commit_log_elements, element
WHERE commit_log_elements.commit_log_id = commit_log.id
AND commit_log_elements.element_id = element.id
ORDER BY commit_log.commit_date DESC
LIMIT 100;

Limit  (cost=0.00..468.79 rows=100 width=261) (actual time=1.80..78.10 rows=100 loops=1)
  ->  Nested Loop  (cost=0.00..1248045.68 rows=266228 width=261) (actual time=1.78..77.19 rows=101 loops=1)
        ->  Nested Loop  (cost=0.00..307745.31 rows=266228 width=244) (actual time=0.36..5.41 rows=101 loops=1)
              ->  Index Scan Backward using commit_log_commit_date on commit_log  (cost=0.00..34305.03 rows=70543 width=227) (actual time=0.30..1.17 rows=32 loops=1)
              ->  Index Scan using commit_log_elements_clid on commit_log_elements  (cost=0.00..3.46 rows=33 width=17) (actual time=0.03..0.07 rows=3 loops=32)
                    Index Cond: (commit_log_elements.commit_log_id = "outer".id)
        ->  Index Scan using element_pkey on element  (cost=0.00..3.52 rows=1 width=17) (actual time=0.08..0.09 rows=1 loops=101)
              Index Cond: ("outer".element_id = element.id)
Total runtime: 79.52 msec

If I add another item to the ORDER BY, the times skyrocket. I haven't
let the EXPLAIN ANALYSE run to completion, so below the query is the
plain EXPLAIN output instead:

explain analyse
SELECT element.id,
commit_log.id AS my_commit_log_id,
commit_log.message_id,
to_char(commit_log.commit_date - SystemTimeAdjust(), 'DD Mon YYYY') AS commit_date,
to_char(commit_log.commit_date - SystemTimeAdjust(), 'HH24:MI') AS commit_time,
commit_log.committer,
commit_log.description,
commit_log.encoding_losses,
commit_log.message_subject,
element.name,
commit_log_elements.revision_name,
element_pathname(element.id) AS element_pathname
FROM commit_log, commit_log_elements, element
WHERE commit_log_elements.commit_log_id = commit_log.id
AND commit_log_elements.element_id = element.id
ORDER BY commit_log.commit_date DESC,
my_commit_log_id
LIMIT 100;

explain
SELECT element.id,
commit_log.id AS my_commit_log_id,
commit_log.message_id,
commit_log.committer,
commit_log.description,
commit_log.encoding_losses,
commit_log.message_subject,
element.name,
commit_log_elements.revision_name,
element_pathname(element.id) AS element_pathname
FROM commit_log, commit_log_elements, element
WHERE commit_log_elements.commit_log_id = commit_log.id
AND commit_log_elements.element_id = element.id
ORDER BY commit_log.commit_date DESC,
my_commit_log_id
LIMIT 100;

QUERY PLAN
Limit  (cost=196855.28..196855.53 rows=100 width=261)
  ->  Sort  (cost=196855.28..197520.85 rows=266228 width=261)
        Sort Key: commit_log.commit_date, commit_log.id
        ->  Hash Join  (cost=9478.25..38465.07 rows=266228 width=261)
              Hash Cond: ("outer".commit_log_id = "inner".id)
              ->  Hash Join  (cost=3503.91..19098.46 rows=266228 width=34)
                    Hash Cond: ("outer".element_id = "inner".id)
                    ->  Seq Scan on commit_log_elements  (cost=0.00..4703.28 rows=266228 width=17)
                    ->  Hash  (cost=2463.73..2463.73 rows=132073 width=17)
                          ->  Seq Scan on element  (cost=0.00..2463.73 rows=132073 width=17)
              ->  Hash  (cost=3133.43..3133.43 rows=70543 width=227)
                    ->  Seq Scan on commit_log  (cost=0.00..3133.43 rows=70543 width=227)
(12 rows)
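The difference between the two plans comes down to whether the executor can satisfy the ORDER BY straight from the index. Sorting by commit_date alone, the backward index scan already delivers rows in the right order, so the LIMIT stops after 100 rows; once commit_log.id joins the sort key, index order is no longer sufficient and every row has to be fetched and sorted before the LIMIT applies. A toy sketch of that effect (plain Python standing in for the executor; the numbers and generator are made up, not real planner behaviour):

```python
from itertools import islice

def rows_from_index(n, pulls):
    """Simulate an index scan: yields (commit_date, id) pairs in
    commit_date-DESC order, counting how many rows are actually read."""
    for i in range(n):
        pulls[0] += 1
        yield (n - i // 3, i)   # every 3 rows share a commit_date (ties)

N, LIMIT = 10_000, 100

# Plan 1: ORDER BY commit_date DESC LIMIT 100 -- the index order already
# matches, so the executor can stop after pulling 100 rows.
pulls = [0]
top = list(islice(rows_from_index(N, pulls), LIMIT))
print(pulls[0])   # 100 rows read

# Plan 2: ORDER BY commit_date DESC, id LIMIT 100 -- index order no
# longer matches, so every row is read and sorted before the LIMIT.
pulls = [0]
full = sorted(rows_from_index(N, pulls), key=lambda r: (-r[0], r[1]))[:LIMIT]
print(pulls[0])   # 10000 rows read
```

Depending on the PostgreSQL version, a composite index covering both sort keys (e.g. on (commit_date, id)) might let the planner keep the index-scan plan, though a backward scan of such an index yields id in descending order within each commit_date, which only matches if that suits the query.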

If I restructure the query, I can get times pretty close to the
original query:

explain analyse
SELECT * FROM (
SELECT element.id,
commit_log.id AS my_commit_log_id,
commit_log.message_id,
commit_log.commit_date,
commit_log.committer,
commit_log.description,
commit_log.encoding_losses,
commit_log.message_subject,
element.name,
commit_log_elements.revision_name,
element_pathname(element.id) AS element_pathname
FROM commit_log, commit_log_elements, element
WHERE commit_log_elements.commit_log_id = commit_log.id
AND commit_log_elements.element_id = element.id
ORDER BY commit_log.commit_date DESC
LIMIT 100) AS TEMP
order by commit_date, my_commit_log_id;

Sort  (cost=472.11..472.36 rows=100 width=261) (actual time=80.01..80.40 rows=100 loops=1)
  Sort Key: commit_date, my_commit_log_id
  ->  Subquery Scan "temp"  (cost=0.00..468.79 rows=100 width=261) (actual time=1.66..78.64 rows=100 loops=1)
        ->  Limit  (cost=0.00..468.79 rows=100 width=261) (actual time=1.63..76.75 rows=100 loops=1)
              ->  Nested Loop  (cost=0.00..1248045.68 rows=266228 width=261) (actual time=1.62..75.85 rows=101 loops=1)
                    ->  Nested Loop  (cost=0.00..307745.31 rows=266228 width=244) (actual time=0.34..5.38 rows=101 loops=1)
                          ->  Index Scan Backward using commit_log_commit_date on commit_log  (cost=0.00..34305.03 rows=70543 width=227) (actual time=0.27..1.17 rows=32 loops=1)
                          ->  Index Scan using commit_log_elements_clid on commit_log_elements  (cost=0.00..3.46 rows=33 width=17) (actual time=0.03..0.07 rows=3 loops=32)
                                Index Cond: (commit_log_elements.commit_log_id = "outer".id)
                    ->  Index Scan using element_pkey on element  (cost=0.00..3.52 rows=1 width=17) (actual time=0.08..0.09 rows=1 loops=101)
                          Index Cond: ("outer".element_id = element.id)
Total runtime: 81.04 msec
(12 rows)
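The rewrite is fast because the outer sort only touches the 100 rows the inner query kept. One subtlety worth noting (my observation, not something the plans above show): if several rows tie on commit_date at the 100-row cutoff, the inner LIMIT picks an arbitrary subset of the tied rows, and the outer sort can only reorder the rows already chosen, not swap rows in or out. A small Python illustration with made-up data:

```python
LIMIT = 3
# 200 rows sharing one commit_date, delivered in some physical order:
rows = [(5, ident) for ident in reversed(range(200))]

# Rewrite: LIMIT on commit_date alone, then re-sort by (commit_date, id).
# Python's sorted() is stable, so the inner step keeps physical order.
inner = sorted(rows, key=lambda r: -r[0])[:LIMIT]
rewritten = sorted(inner, key=lambda r: (-r[0], r[1]))

# Direct query: ORDER BY commit_date DESC, id LIMIT 3.
direct = sorted(rows, key=lambda r: (-r[0], r[1]))[:LIMIT]

print(rewritten)  # [(5, 197), (5, 198), (5, 199)]
print(direct)     # [(5, 0), (5, 1), (5, 2)]  -- different rows!
```

In practice this only matters when duplicate commit_date values straddle the LIMIT boundary; if commit_date is effectively unique, the rewrite and the direct query return the same rows.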

Any comments?
--
Dan Langille : http://www.langille.org/
