Use of index in 7.0 vs 6.5

From: Ryan Bradetich <ryan_bradetich(at)hp(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Use of index in 7.0 vs 6.5
Date: 2000-05-24 23:05:58
Message-ID: 392C6056.D5BF7FE8@hp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Tom (Or anyone else who is good with PostgreSQL statistics),

I am in the process of transitioning from postgreSQL 6.5.3 to
postgreSQL 7.0. I ran into an issue where a sequential scan
is being choosen on postgreSQL 7.0 where an index scan was
choosen on postgreSQL 6.5.3.

Note: All tables have been freshly vacuum'd and analyzed.

procman=# select version();
version
-------------------------------------------------------------------
PostgreSQL 7.0.0 on hppa2.0w-hp-hpux11.00, compiled by gcc 2.95.2
(1 row)

procman=# explain select count(catagory) from medusa where host_id = 404
and catagory like 'A%';
NOTICE: QUERY PLAN:
Aggregate (cost=189546.19..189546.19 rows=1 width=12)
-> Seq Scan on medusa (cost=0.00..189529.43 rows=6704 width=12)
EXPLAIN

Note: The above query produces an index scan on postgreSQL 6.5.3.

procman=# set enable_seqscan = off;
SET VARIABLE

procman=# explain select count(catagory) from medusa where host_id = 404
and catagory like 'A%';
NOTICE: QUERY PLAN:
Aggregate (cost=207347.36..207347.36 rows=1 width=12)
-> Index Scan using medusa_host_id_key on medusa
(cost=0.00..207330.60 rows=6704 width=12)
EXPLAIN

Here are the statistics:

procman=# select attname,attdisbursion,s.*
procman-# from pg_statistic s, pg_attribute a, pg_class c
procman-# where starelid = c.oid and attrelid = c.oid and staattnum =
attnum
procman-# and relname = 'medusa';
attname | attdisbursion | starelid | staattnum | staop | stanullfrac
| stacommonfrac |
stacommonval |
staloval | stahival
-----------+---------------+----------+-----------+-------+-------------+---------------+-----------------------------------------------------------------------------+----------------------------

--------------------------------------+-----------------------------------------

host_id | 0.00621312 | 30874288 | 1 | 97 | 0
| 0.0279425 |
446
| 0
| 11011
(1 row)

Here is my analysis of the stastics (based on the examples in the
archive).

The most common value host_id in the table is 446 with row fraction of
~ 2.8%.
The estimated number of rows in the index is 6704. This table has
4,630,229
entries in the table.

Hopefully this analysis is correct, if not .. please correct me :)

I do not understand why the planner would choose a seqscan over the
index scan because
6704/4,630,229 is ~ 0.15%.

Thanks for your time,

Ryan

- Ryan

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2000-05-24 23:07:47 Re: possible bug with group by?
Previous Message Julie Hunt 2000-05-24 23:05:43 Re: possible bug with group by?