Skip site navigation (1) Skip section navigation (2)

Re: ORDER BY ... LIMIT.. performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "john cartmell" <john(dot)cartmell(at)mediaburst(dot)co(dot)uk>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY ... LIMIT.. performance
Date: 2002-12-06 21:28:18
Message-ID: 13210.1039210098@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-performance
"john cartmell" <john(dot)cartmell(at)mediaburst(dot)co(dot)uk> writes:
> The query:
> SELECT * FROM tblcompany WHERE lower(companyname) like 'a g m%' ORDER BY
> companyname;
> returns 20 rows.
  ^^^^^^^^^^^^^^^

Ahh, light dawns.

> When limit is 19:
> 	EXPLAIN ANALYZE SELECT * FROM tblcompany WHERE
> lower(companyname) like 'a g m%' ORDER BY companyname LIMIT 19,0;
> 	NOTICE:  QUERY PLAN:

> 	Limit  (cost=0.00..4621.68 rows=19 width=223) (actual
> time=561.20..563.11 rows=19 loops=1)
> 	  ->  Index Scan using idx_tblcompany_companyname on tblcompany
> (cost=0.00..1542006.83 rows=6339 width=223) (actual time=561.19..563.07
> rows=20 loops=1)
> 	Total runtime: 563.22 msec

> But when it is 20:
> 	EXPLAIN ANALYZE SELECT * FROM tblcompany WHERE
> lower(companyname) like 'a g m%' ORDER BY companyname LIMIT 20,0;
> 	NOTICE:  QUERY PLAN:

> 	Limit  (cost=0.00..4864.92 rows=20 width=223) (actual
> time=559.58..21895.02 rows=20 loops=1)
> 	  ->  Index Scan using idx_tblcompany_companyname on tblcompany
> (cost=0.00..1542006.83 rows=6339 width=223) (actual
> time=559.57..21894.97 rows=20 loops=1)
> 	Total runtime: 21895.13 msec

The problem here is that in current releases, the Limit plan node tries
to fetch one more row than requested (you can see this in the actual
rowcounts for the first example).  So in your second example, the base
indexscan is actually being run to completion before the Limit gives up.
And since that scan is being used for ordering, not for implementing the
WHERE clause, it visits all the rows.  (When you leave off LIMIT, the
planner chooses a plan that's more amenable to fetching all the data...)

I recently revised the Limit logic so that it doesn't fetch the extra
row.  This takes more code, but you're not the first to complain of
the old behavior.  It'll be in 7.4, or if you're brave you could
probably apply the diff to 7.3.

In the meantime, a more appropriate query would be

SELECT * FROM tblcompany
WHERE lower(companyname) like 'a g m%'
ORDER BY lower(companyname)
LIMIT whatever

so that an index on lower(companyname) could be used both for the WHERE
clause and for the ordering.

			regards, tom lane

In response to

pgsql-performance by date

Next:From: Hannu KrosingDate: 2002-12-06 21:32:06
Subject: Re: Speeding up aggregates
Previous:From: Rod TaylorDate: 2002-12-06 21:24:00
Subject: Re: ALTER TABLE .. < ADD | DROP > OIDS

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group