Re: How to boost performance of queries containing pattern matching characters

From: Richard Huxton <dev(at)archonet(dot)com>
To: gnanam(at)zoniac(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: How to boost performance of queries containing pattern matching characters
Date: 2011-02-14 07:56:56
Message-ID: 4D58E048.9020506@archonet.com
Lists: pgsql-performance

On 14/02/11 07:46, Gnanakumar wrote:
>> If you really need to match all those options, you can't use an index. A
>> substring-matching index would need to have multiple entries per
>> character per value (since it doesn't know what you will search for).
>> The index-size becomes unmanageable very quickly.
>
>> That's why I asked what you really wanted to match.
> To be more specific, our current application allows users to delete
> email(s) by entering a minimum of 3 characters. A note/warning is also shown
> to application users before deleting, explaining the implications of this
> delete action (partial & case-insensitive, and it could be wide-ranging
> too).
>
>> So, I'll ask again: do you really want to match all of those options?
> Yes, as explained above, I want to match all those.

Then you can't use a simple index. Even if you did use an index, it would
probably be much slower than a plain scan for fragments like "com" or "yah"
or "gma" and so on, since those match a large share of the rows.

The closest you can get is something like Artur's option (or the pg_trgm
module - handy since you are matching on 3 characters and up) to select
likely matches, combined with a separate search on '%domain.com%' to
confirm each candidate really matches.
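
To make the "select likely matches, then confirm" idea concrete, here is a
rough sketch in Python. It is not how pg_trgm is implemented; it only
illustrates the principle with an in-memory trigram map. The function names
(trigrams, build_index, search) and the sample addresses are made up for
illustration.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of 3-character substrings of text, lowercased."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def build_index(emails):
    """Map each trigram to the set of emails containing it (toy index)."""
    index = defaultdict(set)
    for email in emails:
        for tri in trigrams(email):
            index[tri].add(email)
    return index


def search(index, pattern):
    """Case-insensitive substring search for patterns of 3+ characters."""
    needle = pattern.lower()
    tris = trigrams(needle)
    if not tris:
        raise ValueError("pattern must be at least 3 characters")
    # Step 1: likely matches are the emails containing every trigram
    # of the pattern.
    candidates = set.intersection(*(index.get(t, set()) for t in tris))
    # Step 2: confirm with a real substring test, because sharing all
    # trigrams does not guarantee one contiguous match.
    return sorted(e for e in candidates if needle in e.lower())


if __name__ == "__main__":
    idx = build_index(["aba.bab@example.com", "abab@example.com"])
    # Both addresses contain the trigrams "aba" and "bab", but only one
    # contains "abab" itself, so the confirm step drops the other.
    print(search(idx, "abab"))
```

The confirm step is what the separate '%domain.com%' search does above: the
trigram lookup only narrows the set of rows, it does not prove a match.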

P.S. - I'd be inclined to just match the central domain parts, so for
"user1@europe.megacorp.com" you would index "europe" and "megacorp" and
only allow matching on the start of each string. Of course if your
application spec says you need to match on "p.c" too then that's what
you have to do.
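
As a rough illustration of the P.S., the sketch below (Python, with
hypothetical helper names domain_parts, build_prefix_index and
prefix_search) pulls out the domain labels, drops the last one, and keeps
them in a sorted list so that a prefix lookup is a range scan, much like an
ordinary btree index would do. Dropping only the last label is a
simplifying assumption; it does not handle suffixes such as "co.uk".

```python
import bisect


def domain_parts(email):
    """Return the domain labels except the last one, lowercased."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain.split(".")[:-1]


def build_prefix_index(emails):
    """Sorted (label, email) pairs, standing in for a btree index."""
    return sorted(
        (label, email) for email in emails for label in domain_parts(email)
    )


def prefix_search(index, prefix):
    """Return emails having a domain label that starts with prefix."""
    p = prefix.lower()
    # Jump to the first entry whose label is >= the prefix.
    start = bisect.bisect_left(index, (p,))
    matches = set()
    for label, email in index[start:]:
        if not label.startswith(p):
            break  # labels are sorted, so no later entry can match
        matches.add(email)
    return sorted(matches)


if __name__ == "__main__":
    idx = build_prefix_index(["user1@europe.megacorp.com"])
    print(domain_parts("user1@europe.megacorp.com"))  # ['europe', 'megacorp']
    print(prefix_search(idx, "mega"))  # ['user1@europe.megacorp.com']
    print(prefix_search(idx, "p.c"))   # []
```

Because every lookup is anchored at the start of a label, the index stays
small and selective, at the cost of not matching fragments like "p.c".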

--
Richard Huxton
Archonet Ltd
