null values / partial indices

From: "Mario Weilguni" <mario(dot)weilguni(at)icomedias(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: null values / partial indices
Date: 2002-11-13 12:43:42
Message-ID: 4D618F6493CE064A844A5D496733D6670391B1@freedom.icomedias.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I noticed that the planner is unable to select an index scan when a partial
index is available, the partial index is based on a "NOT NULL" condition.

Example:

start with no index:
mydb=# EXPLAIN ANALYZE select id from str where url='foobar';
NOTICE: QUERY PLAN:

Seq Scan on str (cost=0.00..88.91 rows=1 width=4) (actual time=5.93..5.93
rows=0 loops=1)
Total runtime: 6.01 msec

EXPLAIN
mydb=# create index str_idx_url on str(url) where url is not null;
CREATE
mydb=# analyze str;
ANALYZE
mydb=# EXPLAIN ANALYZE select id from str where url='foobar';
NOTICE: QUERY PLAN:

Seq Scan on str (cost=0.00..91.05 rows=3 width=4) (actual time=6.24..6.24
rows=0 loops=1)
Total runtime: 6.30 msec

EXPLAIN
mydb=# drop index str_idx_url;
DROP
mydb=# create index str_idx_url on str(url);
CREATE
mydb=# analyze str;
ANALYZE
mydb=# EXPLAIN ANALYZE select id from str where url='foobar';
NOTICE: QUERY PLAN:

Index Scan using str_idx_url on str (cost=0.00..2.56 rows=1 width=4) (actual
time=0.53..0.53 rows=0 loops=1)
Total runtime: 0.60 msec

EXPLAIN

It's no big deal in my application, speed is more than fast enough, I just
noticed it. The documentation says:
"However, keep in mind that the predicate must match the conditions used in
the queries that are supposed to benefit from the index. To be precise, a
partial index can be used in a query only if the system can recognize that
the query's WHERE condition mathematically implies the index's predicate.
PostgreSQL does not have a sophisticated theorem prover that can recognize
mathematically equivalent predicates that are written in different forms.
(Not only is such a general theorem prover extremely difficult to create, it
would probably be too slow to be of any real use.) The system can recognize
simple inequality implications, for example "x < 1" implies "x < 2";
otherwise the predicate condition must exactly match the query's WHERE
condition or the index will not be recognized to be usable. "

Normally a "IS NOT NULL"/"IS NULL" should be easy to recognise, since NULL is
very special. This would allow much smaller indices in some applications, for
example I've a case with a table with 200000 rows where 4 values (of type
text) are not null. The index size would be much smaller without all those
NULL values.

Best regards,
Mario Weilguni

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zeugswetter Andreas SB SD 2002-11-13 13:02:16 Re: null values / partial indices
Previous Message Nigel J. Andrews 2002-11-13 12:28:59 Re: RC1?