Re: limit clause breaks query planner?

From: "Matt Smiley" <mss(at)rentrak(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <david(dot)west(at)cusppoint(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: limit clause breaks query planner?
Date: 2008-09-04 06:10:34
Message-ID: 48BF195E.D078.0028.0@rentrak.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> "Matt Smiley" <mss(at)rentrak(dot)com> writes:
> > So an Index Scan is always going to have a higher cost estimate than
> > an equivalent Seq Scan returning the same result rows (unless
> > random_page_cost is < 1). That's why I think the planner is always
> > preferring the plan that uses a Seq Scan.
>
> If that were the case, we'd never choose an indexscan at all...

You're right, that was a silly guess.

> It's true that a plain indexscan is not preferred for queries that will
> return a large fraction of the table. However, it should be willing to
> use a bitmap scan for this query, given default cost settings (the
> default cost settings will cause it to prefer bitmap scan for retrieving
> up to about a third of the table, in my experience). I too am confused
> about why it doesn't prefer that choice in the OP's example.

It looks like the bitmap scan has a higher cost estimate because the entire bitmap index must be built before beginning the heap scan and returning rows up the pipeline. The row-count limit can't be pushed lower than the bitmap-heap-scan like it can for the basic index-scan.

test_8_3_3=# set enable_seqscan = false ;
SET

test_8_3_3=# set enable_indexscan = false ;
SET

test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=17070.22..17071.02 rows=15 width=8) (actual time=606.902..607.086 rows=15 loops=1)
-> Bitmap Heap Scan on my_table (cost=17070.22..69478.96 rows=988217 width=8) (actual time=606.892..606.983 rows=15 loops=1)
Recheck Cond: (b = 3)
Filter: (a IS NULL)
-> Bitmap Index Scan on idx_b (cost=0.00..16823.17 rows=1033339 width=0) (actual time=592.657..592.657 rows=1000000 loops=1)
Index Cond: (b = 3)
Total runtime: 607.340 ms
(7 rows)

> It would be interesting to alter the random_page_cost setting and see if he gets
> different results.

Using an unmodified postgresql.conf, the cost estimate for an index-scan were so much higher than for a seqscan that random_page_cost had to be set below 0.2 before the index-scan was preferred. However, it looks like this was mainly because effective_cache_size was too small. The planner thought the cache was only 128 MB, and the size of the complete table+index was 39492 + 21946 pages * 8 KB/block = 330 MB. It makes sense for the cost estimate to be so much higher if blocks are expected to be repeatedly re-fetched from disk. I wonder if David's effective_cache_size is too small.

test_8_3_3=# reset all ;
RESET

test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2.50 rows=15 width=8) (actual time=0.036..0.239 rows=15 loops=1)
-> Seq Scan on my_table (cost=0.00..164492.74 rows=988217 width=8) (actual time=0.028..0.138 rows=15 loops=1)
Filter: ((a IS NULL) AND (b = 3))
Total runtime: 0.338 ms
(4 rows)

test_8_3_3=# set enable_seqscan = false ;
SET

test_8_3_3=# show random_page_cost ;
random_page_cost
------------------
4
(1 row)

test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..45.99 rows=15 width=8) (actual time=0.051..0.200 rows=15 loops=1)
-> Index Scan using idx_b on my_table (cost=0.00..3029924.36 rows=988217 width=8) (actual time=0.043..0.100 rows=15 loops=1)
Index Cond: (b = 3)
Filter: (a IS NULL)
Total runtime: 0.308 ms
(5 rows)

test_8_3_3=# set random_page_cost = 0.19 ;
SET
test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2.45 rows=15 width=8) (actual time=0.050..0.201 rows=15 loops=1)
-> Index Scan using idx_b on my_table (cost=0.00..161190.65 rows=988217 width=8) (actual time=0.042..0.097 rows=15 loops=1)
Index Cond: (b = 3)
Filter: (a IS NULL)
Total runtime: 0.307 ms
(5 rows)

Now fix effective_cache_size and try again.

test_8_3_3=# reset all ;
RESET

test_8_3_3=# set effective_cache_size = '500MB' ;
SET
test_8_3_3=# set enable_seqscan = false ;
SET
test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2.78 rows=15 width=8) (actual time=0.051..0.204 rows=15 loops=1)
-> Index Scan using idx_b on my_table (cost=0.00..183361.21 rows=988217 width=8) (actual time=0.043..0.103 rows=15 loops=1)
Index Cond: (b = 3)
Filter: (a IS NULL)
Total runtime: 0.311 ms
(5 rows)

That's better, but still not quite low enough cost estimate to beat the seqscan. Try adjusting random_page_cost again.

test_8_3_3=# set random_page_cost = 3 ;
SET
test_8_3_3=# explain analyze select * from my_table where a is null and b = 3 limit 15 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2.16 rows=15 width=8) (actual time=0.052..0.202 rows=15 loops=1)
-> Index Scan using idx_b on my_table (cost=0.00..142053.51 rows=988217 width=8) (actual time=0.043..0.100 rows=15 loops=1)
Index Cond: (b = 3)
Filter: (a IS NULL)
Total runtime: 0.311 ms
(5 rows)

That's enough: index-scan's 142053.51 beats seqscan's 164492.74. We no longer need to set enable_seqscan=false.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Thomas Finneid 2008-09-04 08:49:28 Re: slow update of index during insert/copy
Previous Message Scott Marlowe 2008-09-03 16:43:46 Re: Partitions number limitation ?