Wrong number of rows estimation by the planner

From: "Yonatan Ben-Nes" <yonatan(at)epoch(dot)co(dot)il>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Wrong number of rows estimation by the planner
Date: 2008-02-08 12:12:09
Message-ID: 2d0127b80802080412o741fa5a1p3a3ae241e21574f7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi all,

When I'm doing an explain analyze to a query of mine I notice that the
number of estimated rows by the planner is a lot smaller then the actual
number of rows, I'm afraid that this make my queries slower.

A query for example is:

EXPLAIN ANALYZE
SELECT product_id,product_name
FROM product
WHERE product_keywords_vector @@ plainto_tsquery('default', 'black') AND
rank(product_keywords_vector, plainto_tsquery('default', 'black')) > 0.4 AND
product_status = TRUE AND product_type = 'comparison'
ORDER BY ((product_buy_number * 4) + product_view_number + 1) *
rank(product_keywords_vector, plainto_tsquery('default', 'black')) DESC;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=10543.67..10544.81 rows=455 width=297) (actual time=
1098.188..1104.606 rows=22248 loops=1)
Sort Key: (((((product_buy_number * 4) + product_view_number) +
1))::double precision * rank(product_keywords_vector, '''black'''::tsquery))
-> Bitmap Heap Scan on product (cost=287.13..10523.59 rows=455
width=297) (actual time=50.496..1071.900 rows=22248 loops=1)
Recheck Cond: (product_keywords_vector @@ '''black'''::tsquery)
Filter: ((rank(product_keywords_vector, '''black'''::tsquery) >
0.4::double precision) AND product_status AND (product_type =
'comparison'::text))
-> Bitmap Index Scan on product_product_keywords_vector (cost=
0.00..287.02 rows=2688 width=0) (actual time=26.385..26.385 rows=72507
loops=1)
Index Cond: (product_keywords_vector @@ '''black'''::tsquery)
Total runtime: 1111.507 ms
(8 rows)

Here as I understand it, at the Bitmap Index Scan on
product_product_keywords_vector the planner estimate that it will retrieve
2688 rows but it actually retrieve 72507 rows and later at the Bitmap Heap
Scan on product it estimate 455 rows and retrieve 22248 rows.

I increased the statistics of the field which the
product_product_keywords_vector index is built on by doing:
ALTER TABLE product ALTER COLUMN product_keywords_vector SET STATISTICS
1000;
ANALYZE;
REINDEX INDEX product_product_keywords_vector;

But it didn't change a thing.

Any ideas?

Thanks in advance,
Yonatan Ben-Nes

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Wong 2008-02-08 20:12:53 Re: Benchmark Data requested
Previous Message Tom Lane 2008-02-07 19:01:58 Re: index usage on arrays