Re: Using LIMIT changes index used by planner

From: Sven Willenberger <sven(at)dmv(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: andrew(at)catalyst(dot)net(dot)nz
Subject: Re: Using LIMIT changes index used by planner
Date: 2004-12-13 22:06:40
Message-ID: 41BE1270.7060908@dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Andrew McMillan wrote:
> On Mon, 2004-12-13 at 01:13 -0500, Sven Willenberger wrote:
>
>>I have a question regarding a serious performance hit taken when using a
>>LIMIT clause. I am using version 7.4.6 on FreeBSD 4.10-STABLE with 2GB
>>of memory. The table in question contains some 25 million rows with a
>>bigserial primary key, orderdate index and a referrer index. The 2
>>select statements are as follow:
>
>
> It's an interesting question, but to be able to get answers from this
> list you will need to provide "EXPLAIN ANALYZE ..." rather than just
> "EXPLAIN ...".
>

A) Query without limit clause:
explain analyze select storelocation,order_number from custacct where
referrer = 1365 and orderdate between '2004-12-07' and '2004-12-07
12:00:00' order by custacctid;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1226485.32..1226538.78 rows=21382 width=43) (actual
time=30340.322..30426.274 rows=21432 loops=1)
Sort Key: custacctid
-> Index Scan using orderdate_idx on custacct
(cost=0.00..1224947.52 rows=21382 width=43) (actual
time=159.218..30196.686 rows=21432 loops=1)
Index Cond: ((orderdate >= '2004-12-07 00:00:00'::timestamp
without time zone) AND (orderdate <= '2004-12-07 12:00:00'::timestamp
without time zone))
Filter: (referrer = 1365)
Total runtime: 30529.151 ms
(6 rows)

************************************

A2) Same query run again, to see effect of caching:
explain analyze select storelocation,order_number from custacct where
referrer = 1365 and orderdate between '2004-12-07' and '2004-12-07
12:00:00' order by custacctid;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1226485.32..1226538.78 rows=21382 width=43) (actual
time=1402.410..1488.395 rows=21432 loops=1)
Sort Key: custacctid
-> Index Scan using orderdate_idx on custacct
(cost=0.00..1224947.52 rows=21382 width=43) (actual time=0.736..1259.964
rows=21432 loops=1)
Index Cond: ((orderdate >= '2004-12-07 00:00:00'::timestamp
without time zone) AND (orderdate <= '2004-12-07 12:00:00'::timestamp
without time zone))
Filter: (referrer = 1365)
Total runtime: 1590.675 ms
(6 rows)

***********************************

B) Query run with LIMIT

explain analyze select storelocation,order_number from custacct where
referrer = 1365 and orderdate between '2004-12-07' and '2004-12-07
12:00:00' order by custacctid limit 10;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..43065.76 rows=10 width=43) (actual
time=1306957.216..1307072.111 rows=10 loops=1)
-> Index Scan using custacct2_pkey on custacct
(cost=0.00..92083209.38 rows=21382 width=43) (actual
time=1306957.205..1307072.017 rows=10 loops=1)
Filter: ((referrer = 1365) AND (orderdate >= '2004-12-07
00:00:00'::timestamp without time zone) AND (orderdate <= '2004-12-07
12:00:00'::timestamp without time zone))
Total runtime: 1307072.231 ms
(4 rows)

************************************

C) Query using the subselect variation

explain analyze select foo.storelocation, foo.order_number from (select
storelocation,order_number from custacct where referrer = 1365 and
orderdate between '2004-12-07' and '2004-12-07 12:00:00' order by
custacctid) as foo limit 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1226485.32..1226485.45 rows=10 width=100) (actual
time=1413.829..1414.024 rows=10 loops=1)
-> Subquery Scan foo (cost=1226485.32..1226752.60 rows=21382
width=100) (actual time=1413.818..1413.933 rows=10 loops=1)
-> Sort (cost=1226485.32..1226538.78 rows=21382 width=43)
(actual time=1413.798..1413.834 rows=10 loops=1)
Sort Key: custacctid
-> Index Scan using orderdate_idx on custacct
(cost=0.00..1224947.52 rows=21382 width=43) (actual time=0.740..1272.380
rows=21432 loops=1)
Index Cond: ((orderdate >= '2004-12-07
00:00:00'::timestamp without time zone) AND (orderdate <= '2004-12-07
12:00:00'::timestamp without time zone))
Filter: (referrer = 1365)
Total runtime: 1418.964 ms
(8 rows)

Thanks,
Sven

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2004-12-13 22:43:07 Re: Using LIMIT changes index used by planner
Previous Message Alvaro Nunes Melo 2004-12-13 19:32:02 Re: Similar tables, different indexes performance