Re: Does pgsql's regex processor optimize Common-Prefix?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Kurapica <kurapica(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Does pgsql's regex processor optimize Common-Prefix?
Date: 2006-12-26 15:08:17
Message-ID: 4312.1167145697@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Kurapica wrote:
>> So I want to know whether pgsql's regex
>> processor can optimize regexes such as:
>> Nebraska|Nevada|North Carolina
>> to
>> N(e(braska|vada)|orth Carolina)

> Compared to the use of indexes to skip whole table scanning, this
> optimization is going to have very little impact. So don't worry about
> it.

Well, if you were able to extract a long enough common prefix to make an
index optimization possible/useful, then it would have some value. But
that seems unlikely. What I think would be considerably more
interesting is a conversion to an OR form:
state ~ '(^Nebraska)|(^Nevada)|(^North Carolina)'
to
state ~ '^Nebraska' OR state ~ '^Nevada' OR state ~ '^North Carolina'

which could be planned as three separate, very-selective indexscans ---
unlike the rewritten version proposed above.

But Oleg's suggestion of using pg_trgm or some other full-text searching
mechanism is probably at least as good, and it requires no new coding.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Gurjeet Singh 2006-12-26 15:09:01 Re: NEED URGENT HELP....
Previous Message Shoaib Mir 2006-12-26 15:06:38 Re: Clustering & Load Balancing & Replication