Optimizer not using index on 120M row table

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Optimizer not using index on 120M row table
Date: 2003-04-08 03:29:03
Message-ID: 20030407222903.M31861@flake.decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

As background: this is the main table for http://stats.distributed.net

Table "public.email_contrib"
Column | Type | Modifiers
------------+---------------+-----------
id | integer | not null
team_id | integer | not null
date | date | not null
project_id | smallint | not null
work_units | numeric(20,0) | not null
Indexes: email_contrib_pkey primary key btree (project_id, id, date)

id is the id of a participant, team_id is the team they were on for that
day, date is the day the work was done, project_id is the project, and
work_units is the amount of work done.

explain select * from email_contrib where project_id=8 and id=39622 and
date='3/1/03';
QUERY PLAN
-------------------------------------------------------------------------------
Seq Scan on email_contrib (cost=0.00..2942185.40 rows=1 width=25)
Filter: ((project_id = 8) AND (id = 39622) AND (date =
'2003-03-01'::date))
(2 rows)

Is there any reason why this shouldn't be using the index? The
selectivity on project_id is very low (only 5 values for all 120M rows).

select attname, avg_width, n_distinct, correlation from pg_stats where
tablename='email_contrib';
attname | avg_width | n_distinct | correlation
------------+-----------+------------+-------------
id | 4 | 95184 | 0.496598
team_id | 4 | 1361 | 0.219478
date | 4 | 1769 | 0.329469
project_id | 2 | 5 | 1
work_units | 11 | 2100 | 0.0900541

It seems that not only should the query I explained be using the index,
but it seems that something like select sum(work_units) .. where
project_id=8 and id=39622 should also use the index.

I've run vacuum analyze to no effect.

On a related note, will pgsql do 'index covering', reading only the
index if it contains all the information a query needs? IE: in Sybase,
this query will only hit the index on Email_Contrib:

select date from email_contrib where project_id=8 and id=39622;

because email_contrib_pkey contains all required values.
--
Jim C. Nasby (aka Decibel!) jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Neil Conway 2003-04-08 03:49:27 Re: Optimizer not using index on 120M row table
Previous Message Lamar Owen 2003-04-08 02:51:48 Re: Failed dependencies: perl(Pg) is needed by postgresql-contrib