Re: [SQL] Techniques for quickly finding words in a phrase...

From: om <o(at)mueschke(dot)de>
To: "Saltsgaver, Scott" <scottsa(at)aiinet(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: [SQL] Techniques for quickly finding words in a phrase...
Date: 2000-02-12 08:30:27
Message-ID: 20000212093027.A14390@ompc3.dom.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, Feb 11, 2000 at 06:08:43PM -0500, Tom Lane wrote:
> "Saltsgaver, Scott" <scottsa(at)aiinet(dot)com> writes:
> > SELECT p.id, phrase FROM Phrase AS p, PhraseWords AS pw
> > WHERE
> > ((p.id = pw.id) AND word LIKE 'WAS%')
> > AND EXISTS (SELECT id FROM PhraseWords AS pw
> > WHERE (p.id = pw.id) AND word LIKE 'WHIT%')
> > AND EXISTS (SELECT id FROM PhraseWords AS pw
> > WHERE (p.id = pw.id) AND word LIKE 'SNOW%');
>
> > For some reason, the select still takes > 1 minute on a fairly decent
> > sized Linux box (500Mhz, 128MB ram).
>
> Subselects are pretty inefficient in Postgres at present. Try rewriting
> it as a join:
>
> SELECT p.id, phrase FROM Phrase AS p, PhraseWords AS pw1,
> PhraseWords AS pw2, PhraseWords AS pw3
> WHERE p.id = pw1.id AND pw1.word LIKE 'WAS%'
> AND p.id = pw2.id AND pw2.word LIKE 'WHIT%'
> AND p.id = pw3.id AND pw3.word LIKE 'SNOW%';

another approach would leave the PhraseWords table aside and use regular
expressions to find matches in table Phrase. of course, this couldn't take
advantage of indices, but maybe the fact that it avoids the join (or
subselect) helps performance.

SELECT id, phrase FROM Phrase
WHERE phrase ~* '[[:<:]]was'
AND phrase ~* '[[:<:]]whit'
AND phrase ~* '[[:<:]]snow';

-- oliver

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Peter Stamfest 2000-02-12 23:57:04 text -> char
Previous Message phil 2000-02-12 04:29:06 Type casting bool?