Skip site navigation (1) Skip section navigation (2)

Re: LIKE search and performance

From: PFC <lists(at)peufeu(dot)com>
To: "Richard Huxton" <dev(at)archonet(dot)com>, mark(at)mark(dot)mielke(dot)cc
Cc: "Mark Lewis" <mark(dot)lewis(at)mir3(dot)com>, "James Mansion" <james(at)mansionfamily(dot)plus(dot)com>, "Magnus Hagander" <magnus(at)hagander(dot)net>, "Alexander Staubo" <alex(at)purefiction(dot)net>, Andy <frum(at)ar-sd(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: LIKE search and performance
Date: 2007-05-25 16:51:04
Message-ID: op.tsv6fevgcigqcu@apollo13 (view raw or flat)
Thread:
Lists: pgsql-performance

> OK - any application that allows user-built queries: <choose column:  
> foo> <choose filter: contains> <choose target: "bar">
>
> Want another? Any application that has a "search by name" box - users  
> can (and do) put one letter in and hit enter.
>
> Unfortunately you don't always have control over the selectivity of  
> queries issued.

	-*- HOW TO MAKE A SEARCH FORM -*-

	Imagine you have to code the search on IMDB.

	This is what a smart developer would do

	First, he uses AJAX autocompletion, so the thing is reactive.
	Then, he does not bother the user with a many-fields form. Instead of  
forcing the user to think (users HATE that), he writes smart code.
	Does Google Maps have separate fields for country, city, street, zipcode  
? No. Because Google is about as smart as it gets.

	So, you parse the user query.

	If the user types, for instance, less than 3 letters (say, spi), he  
probably wants stuff that *begins* with those letters. There is no point  
in searching for the letter "a" in a million movie titles database.
	So, if the user types "spi", you display "name LIKE spi%", which is  
indexed, very fast. And since you're smart, you use AJAX. And you display  
only the most popular results (ie. most clicked on).

http://imdb.com/find?s=all&q=spi

	Since 99% of the time the user wanted "spiderman" or "spielberg", you're  
done and he's happy. Users like being happy.
	If the user just types "a", you display the first 10 things that start  
with "a", this is useless but the user will marvel at your AJAX skillz.  
Then he will probably type in a few other letters.

	Then, if the user uses his space bar and types "spi 1980" you'll  
recognize a year and display spielberg's movies in 1980.
	Converting your strings to phonetics is also a good idea since about 0.7%  
of the l33T teenagers can spell stuff especially spiElberg.

	Only the guy who wants to know who had sex with marilyn monroe on the  
17th day of the shooting of Basic Instinct will need to use the Advanced  
search.

	If you detect several words, then switch to a prefix-based fulltext  
search like Xapian which utterly rocks.
	Example : the user types "savin priv", you search for "savin*" NEAR  
"priv*" and you display "saving private ryan" before he has even finished  
typing the second word of his query. Users love that, they feel  
understood, they will click on your ads and buy your products.

	In all cases, search results should be limited to less than 100 to be  
easy on the database. The user doesn't care about a search returning more  
than 10-20 results, he will just rephrase the query, and the time taken to  
fetch those thousands of records with name LIKE '%a%' will have been  
utterly lost. Who goes to page 2 in google results ?

	BOTTOM LINE : databases don't think, you do.

In response to

Responses

pgsql-performance by date

Next:From: markDate: 2007-05-25 16:56:33
Subject: Re: LIKE search and performance
Previous:From: Peter T. BreuerDate: 2007-05-25 16:34:36
Subject: Re: general PG network slowness (possible cure) (repost)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group