Re: How to boost performance of queries containing pattern matching characters

From: Richard Huxton <dev(at)archonet(dot)com>
To: Artur Zając <azajac(at)ang(dot)com(dot)pl>
Cc: gnanam(at)zoniac(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: How to boost performance of queries containing pattern matching characters
Date: 2011-02-14 07:49:54
Message-ID: 4D58DEA2.3040202@archonet.com
Lists: pgsql-performance

On 14/02/11 07:38, Artur Zając wrote:
> I had almost the same problem.
> To resolve it, I created my own text search parser (myftscfg) which splits
> the text in the column into three-letter parts, for example:
>
> someemail@domain.com is divided into som, ome, mee, eem, ema, mai, ail, il@,
> l@d, @do, dom, oma, mai, ain, in., n.c, .co, com
>
> There should also be an index on the email column:
>
> CREATE INDEX "email_fts" on mytable using gin
> (to_tsvector('myftscfg'::regconfig, email))
>
> Every query like email ilike '%domain.com%' should be rewritten to:
>
> WHERE
> to_tsvector('myftscfg',email) @@ to_tsquery('dom') AND
> to_tsvector('myftscfg',email) @@ to_tsquery('oma') AND
> to_tsvector('myftscfg',email) @@ to_tsquery('mai') AND
...

Looks like you've almost re-invented the trigram module:
http://www.postgresql.org/docs/9.0/static/pgtrgm.html
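
For comparison, a minimal sketch of the pg_trgm route, not a tested recipe. It assumes PostgreSQL 9.1 or later, where trigram indexes can serve LIKE/ILIKE searches and the module is installed with CREATE EXTENSION; on 9.0 the module is loaded from the contrib SQL script and only the % similarity operator is index-assisted. The table and column names (mytable, email) are taken from the quoted message, and the index name is made up for illustration.

```sql
-- Assumption: PostgreSQL 9.1+ with the contrib modules available.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; mytable/email come from the quoted message.
-- gin_trgm_ops is the GIN operator class shipped with pg_trgm.
CREATE INDEX mytable_email_trgm_idx
    ON mytable USING gin (email gin_trgm_ops);

-- On 9.1+ the planner can use the trigram index for an unanchored
-- pattern, so the original query needs no rewriting.
SELECT *
  FROM mytable
 WHERE email ILIKE '%domain.com%';

-- Similarity search (index-assisted on 9.0 as well): % matches rows whose
-- trigram similarity to the argument exceeds the current threshold.
SELECT email, similarity(email, 'someemail@domain.com') AS sml
  FROM mytable
 WHERE email % 'someemail@domain.com'
 ORDER BY sml DESC;
```

Running EXPLAIN on the ILIKE query is the easiest way to confirm whether the index is actually being chosen for your data.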

--
Richard Huxton
Archonet Ltd
