Skip site navigation (1) Skip section navigation (2)

7.0 like selectivity

From: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
To: "pgsql-hackers" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: 7.0 like selectivity
Date: 2000-04-06 15:01:05
Message-ID: NABBINCKAKFCDDKMMJHGKEMJEFAA.Inoue@tpf.co.jp (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi all,

There was a bug(??) report about LIKE optimization of
7.0 beta3 in Japan from Akira Imagawa.
It may be difficult to solve. 

Let t_hoge be a table like
{
	hoge_cd int4 primary key,
	shimeinn text,
	tel text,
	..
}
index hoge_ix2 on t_hoge(shimeinn).
index hoge_ix3 on t_hoge(tel).

There are 348236 rows in t_hoge.

For the query
select hoge_cd,shimeinn,tel
 from t_hoge
 where shimeinn like 'imag%'
   and tel like '012%'
 order by hoge_cd
 limit 100;

64 rows returned immediately.

And for the query
select hoge_cd,shimeinn,tel
 from t_hoge
 where shimeinn like 'imag%'
   and tel like '012-3%'
 order by hoge_cd
  limit 100;

24 rows returned after waiting 8 minutes.

I got the following output from him.
explain select * from t_hoge where tel like '012%';
	Index Scan using t_hoge_ix3 on t_hoge  (cost=0.00..0.23 rows=1981
	width=676)

explain select * from t_hoge where tel like '012-3%';
	Index Scan using t_hoge_ix3 on t_hoge  (cost=0.00..0.00 rows=1981
	width=676)

In fact,count(*) is 342323 and 114741 respectively.

The first problem is that estimated cost is too low.
It seems that the index selectivity of '012-3%' = the index
selectivity of '012%' / (256*256),right ? 
If so,does it give more practical estimation than before ?
It doesn't correspond to rows information either.

In reality, * shimeinn like 'imag%' * is much more restrictive
than * tel like '012-3%' *.  However I couldn't think of the
way to foresee which is more restrictive. Now I doubt whether
we have enough information to estimate LIKE selectivity
correctly. It's the second problem.

Comments ? 

Regards.

Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2000-04-06 15:05:09
Subject: Re: pg_dump and serial
Previous:From: Tom LaneDate: 2000-04-06 14:39:45
Subject: Re: 7.1 items

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group