Re: best practise/pattern for large OR / LIKE searches

From: Jasen Betts <jasen(at)xnet(dot)co(dot)nz>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: best practise/pattern for large OR / LIKE searches
Date: 2009-08-30 12:00:27
Message-ID: h7dpkr$7ld$
Lists: pgsql-general
On 2009-08-26, Ries van Twisk <pg(at)rvt(dot)dds(dot)nl> wrote:
> Hey All,
> I am wondering if there is a common pattern for these sort of queries :
> SELECT * FROM tbl WHERE datanumber LIKE '%12345%' OR LIKE '%54321%' OR  
> LIKE '%8766%' OR LIKE '%009%', ..

SELECT * FROM tbl WHERE datanumber LIKE ANY (ARRAY['%12345%','%54321%','%8766%'...])

> The number of OR/LIKES are in the order of 50-100 items...
> the table tbl is a couple of million rows.

regex might perform better than LIKE ANY

SELECT * FROM tbl WHERE datanumber ~ '12345|54321|8766|009';

regex is compiled to a finite state machine and then the datanumber
column is scanned in a single pass (for each row)
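
To make the two forms concrete, here is a minimal sketch against a throwaway table. The table contents are made up for illustration, and the array_to_string variant is only a convenience I am assuming would be handy when the term list has 50-100 entries; search terms containing regex metacharacters would need escaping first.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TEMP TABLE tbl (id serial PRIMARY KEY, datanumber text);
INSERT INTO tbl (datanumber) VALUES
    ('001234599'), ('7754321'), ('1111111'), ('0087660'), ('20090');

-- One LIKE pattern per search term; a row matches if any pattern matches.
SELECT id, datanumber
FROM tbl
WHERE datanumber LIKE ANY (ARRAY['%12345%', '%54321%', '%8766%', '%009%']);

-- The same search as a single regular expression using alternation.
-- Note the operand order: the column is on the left, the pattern on the right.
SELECT id, datanumber
FROM tbl
WHERE datanumber ~ '12345|54321|8766|009';

-- Building the pattern from an array of plain terms instead of typing it out.
SELECT id, datanumber
FROM tbl
WHERE datanumber ~ array_to_string(ARRAY['12345', '54321', '8766', '009'], '|');
```

All three queries should return every sample row except '1111111'. Prefixing each one with EXPLAIN ANALYZE on the real table is the simplest way to check whether the regex actually beats LIKE ANY for your data.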

> Searches are currently taking too long and we would like to optimize  
> them, but before we dive into our own solution we
> were wondering if there are already common solutions for this...

try regex first. if that's too slow you may need to write a
dictionary function that splits datanumber into its components
and use full text index/search. (this will slow down updates, as they will do
up to 20 inserts into the index)

searches should then be optimally fast
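
One possible reading of the full text suggestion, sketched under several assumptions: datanumber holds only digits, the server supports prefix matching in tsquery (8.4 or later), and the "components" are the suffixes of the value. The function name datanumber_suffixes and the index name are hypothetical, and this uses a plain SQL function rather than a real text search dictionary, so it may not be exactly what was meant above. The idea is that a term occurs somewhere inside datanumber exactly when it is a prefix of one of its suffixes.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TEMP TABLE tbl (id serial PRIMARY KEY, datanumber text);
INSERT INTO tbl (datanumber) VALUES
    ('001234599'), ('7754321'), ('1111111'), ('0087660'), ('20090');

-- Hypothetical helper: turns '20090' into the tsvector
-- '0' '0090' '090' '20090' '90' (one lexeme per suffix).
-- Assumes digits only; quotes, colons or backslashes would need escaping.
CREATE FUNCTION datanumber_suffixes(text) RETURNS tsvector AS $$
    SELECT array_to_string(ARRAY(
        SELECT substr($1, i)
        FROM generate_series(1, length($1)) AS i
    ), ' ')::tsvector;
$$ LANGUAGE sql IMMUTABLE;

-- Expression index over the suffixes; every insert or update now adds
-- one index entry per suffix, which is where the write cost comes from.
CREATE INDEX tbl_datanumber_suffix_idx
    ON tbl USING gin (datanumber_suffixes(datanumber));

-- Substring search expressed as prefix matches against the suffixes.
SELECT id, datanumber
FROM tbl
WHERE datanumber_suffixes(datanumber)
      @@ '12345:* | 54321:* | 8766:* | 009:*'::tsquery;
```

On the sample rows this should again match everything except '1111111'. Whether the planner picks the index on a table of a few million rows depends on how selective the terms are, so it is worth confirming with EXPLAIN ANALYZE.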
