Skip site navigation (1) Skip section navigation (2)

Re: IN operator causes sequential scan (vs. multiple OR expressions)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ryan Holmes <ryan(at)hyperstep(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: IN operator causes sequential scan (vs. multiple OR expressions)
Date: 2007-01-28 01:56:39
Message-ID: 26081.1169949399@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-performance
Ryan Holmes <ryan(at)hyperstep(dot)com> writes:
> So, yes, disabling seqscan does force an index scan for the IN  
> version. My question now is, how do I get PostgreSQL to make the  
> "right" decision without disabling seqscan?

I pinged you before because in a trivial test case I got
indexscans out of both text and varchar cases:

regression=# create table foo (f1 text unique, f2 varchar(25) unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_f1_key" for table "foo"
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_f2_key" for table "foo"
CREATE TABLE
regression=# explain select * from foo where f1 in ('foo', 'bar');
                               QUERY PLAN                                
-------------------------------------------------------------------------
 Bitmap Heap Scan on foo  (cost=4.52..9.86 rows=2 width=61)
   Recheck Cond: (f1 = ANY ('{foo,bar}'::text[]))
   ->  Bitmap Index Scan on foo_f1_key  (cost=0.00..4.52 rows=2 width=0)
         Index Cond: (f1 = ANY ('{foo,bar}'::text[]))
(4 rows)

regression=# explain select * from foo where f2 in ('foo', 'bar');
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Bitmap Heap Scan on foo  (cost=6.59..17.27 rows=10 width=61)
   Recheck Cond: ((f2)::text = ANY (('{foo,bar}'::character varying[])::text[]))
   ->  Bitmap Index Scan on foo_f2_key  (cost=0.00..6.59 rows=10 width=0)
         Index Cond: ((f2)::text = ANY (('{foo,bar}'::character varying[])::text[]))
(4 rows)

But on closer inspection the second case is not doing the right thing:
notice the rowcount estimate is 10, whereas it should be only 2 because
of the unique index on f2.  I poked into it and realized that in 8.2
scalararraysel() fails to deal with binary-compatible datatype cases,
instead falling back to a not-very-bright generic estimate.

I've committed a fix for 8.2.2, but in the meantime maybe you could
change your varchar column to text?

			regards, tom lane

In response to

Responses

pgsql-performance by date

Next:From: Ryan HolmesDate: 2007-01-28 02:31:45
Subject: Re: IN operator causes sequential scan (vs. multiple OR expressions)
Previous:From: Ryan HolmesDate: 2007-01-28 01:34:12
Subject: Re: IN operator causes sequential scan (vs. multiple OR expressions)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group