Re: Is it possible to use index on column for regexp match operator '~'?

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Rural Hunter <ruralhunter(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Is it possible to use index on column for regexp match operator '~'?
Date: 2011-12-14 20:43:37
Message-ID: CABRT9RC==HByCGLRL3xHz05ng=19axvm0kEXLAGqgdHXJjjjpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

2011/12/14 Rural Hunter <ruralhunter(at)gmail(dot)com>:
> for example, the where condition is: where 'aaaa' ~ col1. I created a normal
> index on col1 but seems it is not used.

I assume you want to search values that match one particular pattern,
that would be col1 ~ 'aaaa'

The answer is, only very simple patterns that start with '^'. Note
that you MUST use the text_pattern_ops index opclass:

# create table words (word text);
# copy words from '/usr/share/dict/words';
# create index on words (word text_pattern_ops);
# explain select * from words where word ~ '^post';
Index Scan using words_word_idx on words (cost=0.00..8.28 rows=10 width=9)
Index Cond: ((word ~>=~ 'post'::text) AND (word ~<~ 'posu'::text))
Filter: (word ~ '^post'::text)

----

If you just want to search for arbitrary strings, in PostgreSQL 9.1+
you can use pg_trgm extension with a LIKE expression:

# create extension pg_trgm;
# create index on words using gist (word gist_trgm_ops);
# explain select * from words where word like '%post%';
Bitmap Heap Scan on words (cost=4.36..40.23 rows=10 width=9)
Recheck Cond: (word ~~ '%post%'::text)
-> Bitmap Index Scan on words_word_idx1 (cost=0.00..4.36 rows=10 width=0)
Index Cond: (word ~~ '%post%'::text)

----

There's also the "wildspeed" external module which is somewhat faster
at this: http://www.sai.msu.su/~megera/wiki/wildspeed

And someone is working to get pg_trgm support for arbitrary regular
expression searches. This *may* become part of the next major
PostgreSQL release (9.2)
http://archives.postgresql.org/message-id/CAPpHfduD6EGNise5codBz0KcdDahp7--MhFz_JDD_FRPC7-i=A@mail.gmail.com

Regards,
Marti

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rural Hunter 2011-12-15 01:54:06 Re: Is it possible to use index on column for regexp match operator '~'?
Previous Message Kaloyan Iliev Iliev 2011-12-14 17:48:17 Re: Slow query after upgrade from 8.2 to 8.4