Re: Speaking of Indexing... (Text indexing)

From: Joel Burton <jburton(at)scw(dot)org>
To: Poet/Joshua Drake <poet(at)linuxports(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Speaking of Indexing... (Text indexing)
Date: 2001-04-11 02:13:08
Message-ID: Pine.LNX.4.21.0104102209020.31213-100000@olympus.scw.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Tue, 10 Apr 2001, Poet/Joshua Drake wrote:

> I've been experimenting a bit with Full Text Indexing in PostgreSQL. I
> have found several conflicting sites various places on the net pertaining
> to whether or not PostgreSQL supports FTI, and I was hoping I could find
> an authoritative answer here - I tried searching the website's archives,
> but the search seems to be having some problems.
>
> At any rate, I am running a CVS snapshot of 7.1, and I have been trying to
> create a full text index on a series of resumes. Some of these exceed 8k
> in size, which is no longer a storage problem of course with 7.1, but I
> seem to have run into the wicked 8k once again. Specifically:

Joshua --

CREATE INDEX ... creates an index on a field, allowing for faster
searches, *if* you're looking to match the first part of that text string.
So, if I have a table of movie titles, creating an index on column title
will allow for faster searches if my criteria is something like
title='Toto Les Heros' (or like 'Toto%' or such), but not (AFAIK) for
title ~ 'Les' or title LIKE '%Les%'. The index doesn't help here.

For these long fields you have, you probably want to search for a word in
the field, not match the start of the field. A regular index isn't your
answer.

There is a full text indexing solution in the contrib/ directory of the
source. It essentially creates a new table w/every occurence of every word
fragment, with a reference back to the row that contains it. Searching
against this is indexed, and is speedy. The only downside is that you will
have a *large* table holding the full text index.

More help can be found in the README file in contrib/fulltextindex

HTH,
--
Joel Burton <jburton(at)scw(dot)org>
Director of Information Systems, Support Center of Washington

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Joel Burton 2001-04-11 02:14:38 Re: newbie question - INSERT
Previous Message Brett W. McCoy 2001-04-11 01:47:03 RE: [GENERAL] perl dbi:pg

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Lockhart 2001-04-11 02:34:01 Re: Speaking of Indexing... (Text indexing)
Previous Message Homayoun Yousefi'zadeh 2001-04-10 23:38:28 Re: JDBC and Perl compiling problems w/ postgresql-7.1rc4