Re: Incorrect behavior with CE and ORDER BY

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incorrect behavior with CE and ORDER BY
Date: 2006-10-24 18:58:01
Message-ID: 20061024185801.GQ26593@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


I followed up with Joshua on Jabber. This is the query:

SELECT pse_userid FROM tbl_profile_search WHERE pse_normalized_text='1'
and pse_interest_type = 10 order by pse_lastlogin DESC limit 50 offset 0

I suggested adding an index on (pse_normalized_text, pse_lastlogin), on
the assumption that the planner would get the sorted output from there
and be able to push the LIMIT clause, just below the indexscan, thus
saving the big heap scan (and a sort across a large result set). But it
turns out the index is already there.

So it seems to me to be a planner shortcoming. Is this correct?

My idea of the plan would be (tabs=8 spaces)

Limit (50)
Sort (key: pse_lastlogin)
Result
Append
SeqScan tbl_profile_search
Indexscan tbl_profile_search_interest_1
Limit (50)
IndexScan on the index mentioned above

Is this possible? It would be very fast. Maybe it should be like this
instead:

Limit (50)
Sort (key: pse_lastlogin)
Result
Append
Limit (50)
SeqScan tbl_profile_search
Limit (50)
Indexscan tbl_profile_search_interest_1
Limit (50)
IndexScan on the index mentioned above

This is the actual plan:

> Limit (cost=47110.19..47110.31 rows=50 width=8) (actual
> time=6088.013..6088.269 rows=50 loops=1)
> -> Sort (cost=47110.19..47943.14 rows=333179 width=8) (actual
> time=6088.007..6088.104 rows=50 loops=1)
> Sort Key: public.tbl_profile_search.pse_lastlogin
> -> Result (cost=0.00..16547.78 rows=333179 width=8) (actual
> time=0.020..4339.472 rows=334319 loops=1)
> -> Append (cost=0.00..16547.78 rows=333179 width=8)
> (actual time=0.016..3208.022 rows=334319 loops=1)
> -> Seq Scan on tbl_profile_search
> (cost=0.00..2.27 rows=1 width=8) (actual time=0.012..0.047 rows=2 loops=1)
> Filter: (((pse_normalized_text)::text =
> '1'::text) AND (pse_interest_type = 10))
> -> Index Scan using index_pse_09_on_part_1 on
> tbl_profile_search_interest_1 tbl_profile_search (cost=0.00..4.73
> rows=1 width=8) (actual time=0.202..0.202 rows=0 loops=1)
> Index Cond: ((pse_normalized_text)::text =
> '1'::text)
> Filter: (pse_interest_type = 10)
> -> Bitmap Heap Scan on
> tbl_profile_search_interest_10 tbl_profile_search
> (cost=3579.12..16540.78 rows=333177 width=8) (actual
> time=90.619..2116.224 rows=334317 loops=1)
> Recheck Cond: ((pse_normalized_text)::text =
> '1'::text)
> Filter: (pse_interest_type = 10)
> -> Bitmap Index Scan on
> index_pse_09_on_part_10 (cost=0.00..3579.12 rows=333177 width=0)
> (actual time=89.052..89.052 rows=340964 loops=1)
> Index Cond:
> ((pse_normalized_text)::text = '1'::text)
> Total runtime: 6103.190 ms

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Sullivan 2006-10-24 19:00:34 Re: Conference materials (Was: [HACKERS] pdfs of
Previous Message Josh Berkus 2006-10-24 18:37:46 Re: [HACKERS] Replication documentation addition