planner question..

From: Rajesh Kumar Mallah <mallah(at)trade-india(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: planner question..
Date: 2003-04-17 08:34:10
Message-ID: 200304171404.10306.mallah@trade-india.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi,

For a distribution of data like below why does the planner
choses to do an index scan by default for source = 'REGIS' when > 50%
of the rows are having source='REGIS'.

Of course index produced better results 22 sec versus 4 secs.
but can someone explain this case.

tradein_clients=# SELECT source,count(*) from profile_master group by source;
+----------+--------+
| source | count |
+----------+--------+
| BRANDING | 64008 |
| CATALOG | 711 |
| EYP | 10380 |
| IID | 349 |
| IP | 493 |
| REGIS | 102090 |
+----------+--------+
(6 rows)

tradein_clients=# SELECT count(*) from profile_master ;
+--------+
| count |
+--------+
| 178031 |
+--------+
(1 row)

tradein_clients=# SET enable_indexscan=off;
tradein_clients=# explain analyze SELECT count(*) from profile_master where source='REGIS';
+------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+------------------------------------------------------------------------------------------------------------------------------+
| Aggregate (cost=23982.58..23982.58 rows=1 width=0) (actual time=22872.97..22872.97 rows=1 loops=1) |
| -> Seq Scan on profile_master (cost=0.00..23970.40 rows=4871 width=0) (actual time=328.44..22730.69 rows=102090 loops=1) |
| Filter: (source = 'REGIS'::character varying) |
| Total runtime: 22873.03 msec |
+------------------------------------------------------------------------------------------------------------------------------+
(4 rows)

tradein_clients=# SET enable_indexscan=on;
SET
tradein_clients=# explain analyze SELECT count(*) from profile_master where source='REGIS';
+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate (cost=18225.48..18225.48 rows=1 width=0) (actual time=5919.24..5919.24 rows=1 loops=1) |
| -> Index Scan using profile_master_index_source on profile_master (cost=0.00..18213.31 rows=4871 width=0) (actual time=9.43..5786.15 rows=102090 loops=1) |
| Index Cond: (source = 'REGIS'::character varying) |
| Total runtime: 5919.31 msec |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
(4 rows)

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Rajesh Kumar Mallah 2003-04-17 10:00:45 Re: IN Qeury Problem
Previous Message Rajesh Kumar Mallah 2003-04-17 08:21:19 analyse question..