Re: best practise/pattern for large OR / LIKE searches

From: Jasen Betts <jasen(at)xnet(dot)co(dot)nz>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: best practise/pattern for large OR / LIKE searches
Date: 2009-08-30 12:00:27
Message-ID: h7dpkr$7ld$1@reversiblemaps.ath.cx
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2009-08-26, Ries van Twisk <pg(at)rvt(dot)dds(dot)nl> wrote:
>
> --Apple-Mail-1173-222712773
> Content-Type: text/plain;
> charset=US-ASCII;
> format=flowed;
> delsp=yes
> Content-Transfer-Encoding: 7bit
>
> Hey All,
>
> I am wondering if there is a common pattern for these sort of queries :
>
> SELECT * FROM tbl WHERE datanumber LIKE '%12345%' OR LIKE '%54321%' OR
> LIKE '%8766%' OR LIKE '%009%', ..

SELECT * FROM tbl WHERE datanumber LIKE ANY ARRAY('%12345%','%54321%','%8766%'...)

> The number of OR/LIKES are in the order of 50-100 items...
> the table tbl is a couple of million rows.

regex might perfrom better than LIKE ANY

SELECT * FROM tbl WHERE '12345|54321|8766|009' ~ datanumber;

regex is compiled to a finite state machine and then the datanumber
column is scanned in a single pass (for each row)

> Searches are currently taking to long and we would like to optimize
> them, but before we dive into our own solution we
> where wondering if there already common solutions for this...

try regex first if that's too slow you may need to write a
dictionary function that splits datanuimber into it's components
and use full text index/search. (this will slow down updates as they will do
upto 20 inserts into the index)

searches should then be optimally fast

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephen Cuppett 2009-08-30 12:11:13 Trouble using TG_TABLE_NAME in BEFORE INSERT OR UPDATE trigger
Previous Message Eric Comeau 2009-08-30 11:21:39 Re: New database or New Schema?