Quick Links

How to get around LIKE inefficiencies?

From:	The Hermit Hacker <scrappy(at)hub(dot)org>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	How to get around LIKE inefficiencies?
Date:	2000-11-06 01:03:38
Message-ID:	Pine.BSF.4.21.0011052045440.494-100000@thelab.hub.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

I'm tryin to figure out how to speed up udmsearch when run under
postgresql, and am being hit by atrocious performance when using a LIKE
query ... the query looks like:

SELECT ndict.url_id,ndict.intag
FROM ndict,url
WHERE ndict.word_id=1971739852
AND url.rec_id=ndict.url_id
AND (url.url LIKE 'http://www.postgresql.org/%');

Take off the AND ( LIKE ) part of the query, finishes almost as soon as
you hit return. Put it back in, and you can go for coffee before it
finishes ...

If I do 'SELECT url_id FROM ndict WHERE word_id=1971739852', there
are 153 records returned ... is there some way, that I'm not thinking, of
re-writing the above so that it 'resolves' the equality before the LIKE in
order to reduce the number of tuples that it has to do the LIKE on? Is
there some way of writing the above so that it doesn't take forever to
execute?

I'm running this on a Dual-PIII 450 Server, 512Meg of RAM, zero
swap space being used ... the database has its indices on one hard drive,
the tables themselves are on a second one ... its PgSQL 7.0.2 (Tom,
anything in v7.0.3 that might improve this?) and startup is as:

#!/bin/tcsh
setenv PORT 5432
setenv POSTMASTER /pgsql/bin/postmaster
unlimit
${POSTMASTER} -B 384 -N 192 \
-o "-F -S 32768" \
-i -p ${PORT} -D/pgsql/data >&
/pgsql/logs/postmaster.${PORT}.$$ &

So its not like I'm not throwing alot of resources at this ...

Is there anything that we can do to improve this? I was trying to
think of some way to use a subselect to narrow the search results, or
something ...

Oh, the above explains down to:

NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..1195.14 rows=1 width=10)
-> Index Scan using url_url on url (cost=0.00..2.73 rows=1 width=4)
-> Index Scan using n_word on ndict (cost=0.00..1187.99 rows=353 width=6)

EXPLAIN

ndict: 663018 tuples
url: 29276 tuples

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org

Responses

Re: How to get around LIKE inefficiencies? at 2000-11-06 02:28:04 from Tom Lane
Re: How to get around LIKE inefficiencies? at 2000-11-06 03:19:21 from Ron Chmara
Re: How to get around LIKE inefficiencies? at 2000-11-07 01:11:41 from Tatsuo Ishii

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tatsuo Ishii	2000-11-06 01:05:52	Re: Unicode conversion (Re: [COMMITTERS] pgsql (configure.in))
Previous Message	Hiroshi Inoue	2000-11-06 01:00:29	ResetSystemCaches(was Re: relation ### modified while in use)