Quick Links

Re: LIKE search and performance

From:	PFC <lists(at)peufeu(dot)com>
To:	"Mark Lewis" <mark(dot)lewis(at)mir3(dot)com>, "James Mansion" <james(at)mansionfamily(dot)plus(dot)com>
Cc:	"Magnus Hagander" <magnus(at)hagander(dot)net>, "Alexander Staubo" <alex(at)purefiction(dot)net>, Andy <frum(at)ar-sd(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: LIKE search and performance
Date:	2007-05-24 22:09:15
Message-ID:	op.tsuqhpzycigqcu@apollo13
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

> PG could scan the index looking for matches first and only load the
> actual rows if it found a match, but that could only be a possible win
> if there were very few matches, because the difference in cost between a
> full index scan and a sequential scan would need to be greater than the
> cost of randomly fetching all of the matching data rows from the table
> to look up the visibility information.

If you need to do that kind of thing, ie. seq scanning a table checking
only one column among a large table of many columns, then don't use an
index. An index, being a btree, needs to be traversed in order (or else, a
lot of locking problems come up) which means some random accesses.

So, you could make a table, with 2 columns, updated via triggers : your
text field, and the primary key of your main table. Scanning that would be
faster.

Still, a better solution for searching in text is :

- tsearch2 if you need whole words
- trigrams for any substring match
- xapian for full text search with wildcards (ie. John* = Johnny)

Speed-wise those three will beat any seq scan on a large table by a huge
margin.

In response to

Re: LIKE search and performance at 2007-05-24 21:02:40 from Mark Lewis

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Richard Huxton	2007-05-25 08:13:25	Re: LIKE search and performance
Previous Message	Craig James	2007-05-24 22:08:16	Re: LIKE search and performance