Skip site navigation (1) Skip section navigation (2)

Re: BUG #1470: Boolean expression index not used when it could be

From: Sergey Koshcheyev <sergey(dot)p(dot)k(at)gmail(dot)com>
To: psql-bugs(at)postgresql(dot)org
Subject: Re: BUG #1470: Boolean expression index not used when it could be
Date: 2005-02-26 13:09:39
Message-ID: 42207513.3010701@gmail.com (view raw or flat)
Thread:
Lists: pgsql-bugs
Tom Lane wrote:
> "Sergey Koshcheyev" <sergey(dot)p(dot)k(at)hotmail(dot)com> writes:
> 
>>I'm trying to optimize "is null" queries, since PgSQL doesn't index null
>>values. I have found that creating an expression index on (column is null)
>>could work, but it doesn't get used unless the index expression is part of a
>>comparison. Could this be improved, so that (a boolean expression) is taken
>>as equivalent to (a boolean expression = true)?
> 
> 
> You would be better off to use a partial index:
> 
> 	create index tbl1_abc on tbl1 (somecol) where abc is null;
> 
> The advantage of this is that not only do you get the is-null filter,
> but you may be able to filter on some other column(s) at the same time.
> For instance if you commonly query
> 
> 	select ... from tbl1 where abc is null and def > 42
> 
> then making "somecol" be "def" would be a winner.
> 
> 			regards, tom lane

Well, I have tried playing with indexes, but it's hard to define the 
right indexes for the query, even if NULLs are not involved. One more 
example (based on real-world data, except I renamed the fields and the 
table):

office=> explain analyze select * from table where (field1 = 0) AND 
((field2 IS NULL) OR (field3 = 0));
                                                  QUERY PLAN
(a line of dashes stripped to allow wrapping)
  Seq Scan on table  (cost=0.00..2103.92 rows=3091 width=174) (actual 
time=0.176..2413.078 rows=15 loops=1)
    Filter: ((field1 = 0) AND ((field2 IS NULL) OR (field3 = 0)))
  Total runtime: 2413.349 ms
(3 rows)

- Note that the estimate is way off, and raising statistics actually 
makes it even worse. There is an index defined on (field1), but it 
doesn't get picked up (probably because lots of rows have field1 equal 
0). I have tried to index (field1) where ((field2 IS NULL) OR (field3 = 
0)), but it doesn't get picked up either.

office=> explain analyze select * from table where (field1 = 0) AND 
((field2 IS NULL) OR (field3 = 0)) = true;
                                                             QUERY PLAN
(a line of dashes stripped to allow wrapping)
  Index Scan using table_index on table  (cost=0.00..1041.36 rows=289 
width=174) (actual time=89.689..100.895 rows=15 loops=1)
    Index Cond: (((field1 = 0) AND ((field2 IS NULL) OR (field3 = 0)))) 
= true)
  Total runtime: 101.580 ms
(3 rows)

- This query runs much faster, but requires the "= true".

Would the change I suggested (adding the "= true" automatically) be 
difficult to implement? We have about 10-20 various queries defined 
using such conditions, and they are run frequently, so it would help us 
a lot. Or are there any other ways to make it run fast which would not 
require me to change the database schema?

Thanks,
Sergey

P.S. I know that this stuff belongs more to psql-performance list, I'm 
writing it all here just to justify my request.

In response to

pgsql-bugs by date

Next:From: postgresbugsDate: 2005-02-26 13:56:33
Subject: Re: PGPASSWORD
Previous:From: Oliver JowettDate: 2005-02-26 11:26:28
Subject: Re: PGPASSWORD

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group