Skip site navigation (1) Skip section navigation (2)

Index not used past a certain threshold

From: Nishad Prakash <prakashn(at)uci(dot)edu>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Index not used past a certain threshold
Date: 2004-04-30 01:51:42
Message-ID: Pine.GSO.4.58.0404291839200.24603@e4e.oac.uci.edu (view raw or flat)
Thread:
Lists: pgsql-bugs


I seem to have run into an indexing problem in postgres 7.4 on Solaris 8.
An index on a certain table is not being used if a column referenced in
the query has more than a certain number of rows for a given value.
Here's an example:

I have a table like so:

      Table "public.a1"
 Column |  Type   | Modifiers
--------+---------+-----------
 aid    | integer | not null
Indexes:
    "a1b" hash (aid)


With an index like so:

Index "public.a1b"
 Column |  Type
--------+---------
 aid    | integer
hash, for table "public.a1"


If we group this table by "number of rows for a given aid", using this
query:
   select aid, count (aid) from a1 group by aid order by count (aid) using >;
then the largest rows are:

 aid  | count
------+-------
 4085 | 51039
 5065 | 45750
 5026 | 39224
 9010 | 31418
  527 | 30691
 5014 | 29421
 5010 | 24958
    7 | 20723
   57 | 19167
  722 | 17180
[...snip...]

Now, the index is *not* being used if I query for rows
where count >= 19167.

it=> explain select aid from a1 where aid = 57;
                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on a1  (cost=0.00..17986.81 rows=15738 width=4)
   Filter: (aid = 57)

And the rows with higher counts as well.  But beneath that threshold, the
index is indeed being used:

it=> explain select aid from a1 where aid = 722;
                             QUERY PLAN
--------------------------------------------------------------------
 Index Scan using a1b on a1  (cost=0.00..4796.23 rows=1337 width=4)
   Index Cond: (aid = 722)


The actual table in which I found this bug has 12 columns, and the
index involves three columns, but the problem is reproducible in a
table with just one column, if that helps to simplify things for you.
I have tried using hash indices and b-tree indices, but in either
case, once you hit the aid values with the larger number of rows, you
get a sequential scan (although the threshold varies slightly depending
on which indexing method you use).

Can this be fixed or is it just a feature of postgres indexing?

Thanks,

Nishad
ps> I can provide the data for the table if it's any help.
-- 
"Underneath the concrete, the dream is still alive" -- Talking Heads


Responses

pgsql-bugs by date

Next:From: Tom LaneDate: 2004-04-30 04:08:13
Subject: Re: Index not used past a certain threshold
Previous:From: Bruno NecchiDate: 2004-04-30 00:44:45
Subject: Problems....

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group