From: | Kevin Grittner <kgrittn(at)ymail(dot)com> |
---|---|
To: | Giuseppe Broccolo <giuseppe(dot)broccolo(at)2ndquadrant(dot)it>, Robert DiFalco <robert(dot)difalco(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Question about trigram GIST index |
Date: | 2014-12-18 20:03:13 |
Message-ID: | 123557937.571326.1418932993939.JavaMail.yahoo@jws100182.mail.ne1.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Giuseppe Broccolo wrote:
> I'm not sure about the '%' operator, but I'm sure that the GIST
> index will never be used in the
>
> SELECT * FROM users WHERE lower(name) LIKE '%john%';
>
> query; it is used for left or right anchored search, such as
> 'john%' or '%john'.
It *will* use a *trigram* index for a non-anchored search.
test=# create table words (word text not null);
CREATE TABLE
test=# copy words from '/usr/share/dict/words';
COPY 99171
test=# CREATE EXTENSION pg_trgm;
CREATE EXTENSION
test=# CREATE INDEX words_trgm ON words USING gist (word gist_trgm_ops);
CREATE INDEX
test=# vacuum analyze words;
VACUUM
test=# explain analyze select * from words where word like '%john%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on words (cost=4.36..40.24 rows=10 width=9) (actual time=17.758..17.772 rows=8 loops=1)
Recheck Cond: (word ~~ '%john%'::text)
Rows Removed by Index Recheck: 16
Heap Blocks: exact=4
-> Bitmap Index Scan on words_trgm (cost=0.00..4.36 rows=10 width=0) (actual time=17.708..17.708 rows=24 loops=1)
Index Cond: (word ~~ '%john%'::text)
Planning time: 0.227 ms
Execution time: 17.862 ms
(8 rows)
test=# explain analyze select * from words where word ilike '%john%';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on words (cost=44.05..556.57 rows=1002 width=9) (actual time=12.151..12.197 rows=24 loops=1)
Recheck Cond: (word ~~* '%john%'::text)
Heap Blocks: exact=4
-> Bitmap Index Scan on words_trgm (cost=0.00..43.80 rows=1002 width=0) (actual time=12.124..12.124 rows=24 loops=1)
Index Cond: (word ~~* '%john%'::text)
Planning time: 0.392 ms
Execution time: 12.252 ms
(7 rows)
Note that a trigram index is case-insensitive; doing a
case-sensitive search requires an extra Recheck node to eliminate
the rows that match in the case-insensitive index scan but have
different capitalization. Because of that case-sensitive is
slower.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert DiFalco | 2014-12-22 20:53:03 | Number of Columns and Update |
Previous Message | Robert DiFalco | 2014-12-18 19:33:39 | Re: Question about trigram GIST index |