Quick Links

efficient pattern queries (using LIKE, ~)

From:	Fernan Aguero <fernan(at)unsam(dot)edu(dot)ar>
To:	PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject:	efficient pattern queries (using LIKE, ~)
Date:	2007-09-11 18:54:42
Message-ID:	20070911185442.GA38153@iib.unsam.edu.ar
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Hi,

I have a table containing some ~13 million rows. Queries on
indexed fields run fast, but unanchored pattern queries on a
text column are slooooow. Indexing the column doesn't help
(this is already mentioned in the manual).
http://www.postgresql.org/docs/8.2/interactive/indexes-types.html

However, no alternative solution is mentioned for indexing
and/or optimizing queries based on unanchored patterns:
i.e. description LIKE '%kinase%'.

I've already searched the archives, read the manual, googled
around and the only alternative I've found is: full text
indexing (tsearch2 in postgresql-contrib; OpenFTS; others?)

But do note that i) I'm not interested in finding results 'similar to'
the query term (and ranked based on similarity) but just
results 'containing an exact substring' of the query term ...
i.e. not the original goal of a full text search

And, ii) from what I've read it seems that for both tsearch2
and OpenFTS the queries have to be rewritten to explicitly
evaluate the pattern on the special indices, i.e. they're
not transparently available (i.e. via the query planner),

I'm hoping for something like:
CREATE INDEX newindex ON table USING fti (column);

and then having the new index automagically used by the
planner in cases like:
SELECT * FROM table WHERE column LIKE '%magic%';

If there's anything like this, I've failed at finding it ...

Thanks for any pointer,

Fernan

PS: additional information

This is on PostgreSQL-8.2.4, FreeBSD-6.2 (amd64).

EXPLAIN ANALYZE SELECT COUNT(*) FROM dots.transcript WHERE product LIKE '%kinase%'; QUERY PLAN QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=651878.85..651878.86 rows=1 width=0) (actual time=45587.244..45587.246 rows=1 loops=1)
-> Seq Scan on nafeatureimp (cost=0.00..651878.85 rows=1 width=0) (actual time=33.049..45582.628 rows=2255 loops=1)
Filter: (((subclass_view)::text = 'Transcript'::text) AND ((string13)::text ~~ '%kinase%'::text))
Total runtime: 45589.892 ms
(4 rows)

Responses

Re: efficient pattern queries (using LIKE, ~) at 2007-09-11 19:08:37 from Mario Weilguni

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Mario Weilguni	2007-09-11 19:08:37	Re: efficient pattern queries (using LIKE, ~)
Previous Message	Jean-David Beyer	2007-09-11 18:49:28	Re: random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1