Re: Speaking of Indexing... (Text indexing)

From: Andrew McMillan <andrew(at)catalyst(dot)net(dot)nz>
To: Poet/Joshua Drake <poet(at)linuxports(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speaking of Indexing... (Text indexing)
Date: 2001-04-11 05:35:33
Message-ID: 3AD3ED25.6EAAE750@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Poet/Joshua Drake wrote:
>
> Good day,
>
> I've been experimenting a bit with Full Text Indexing in PostgreSQL. I
> have found several conflicting sites various places on the net pertaining
> to whether or not PostgreSQL supports FTI, and I was hoping I could find
> an authoritative answer here - I tried searching the website's archives,
> but the search seems to be having some problems.
>
> At any rate, I am running a CVS snapshot of 7.1, and I have been trying to
> create a full text index on a series of resumes. Some of these exceed 8k
> in size, which is no longer a storage problem of course with 7.1, but I
> seem to have run into the wicked 8k once again. Specifically:
>
> ERROR: index_formtuple: data takes 9344 bytes, max is 8191
>
> Furthermore, after trying to just index on a 8191-character long substring
> of the resume, I run into the following:
>
> ERROR: btree: index item size 3948 exceeds maximum 2713
>
> The only way I could actually get the index created was to substring the
> body of the resumes down to 2k. I also later tried using HASH rather than
> BTREE, which worked, but none of these solutions really appreciably
> increased performance in the way we were hoping.
>
> Are these known and accepted limitations of the current 7.1
> implementation, or am I doing something terribly wrong? ;)
> On Tue, 10 Apr 2001, Thomas Lockhart wrote:

You need to use the 'contrib' code for full-text indexing. The indexing you are
trying to do with that is just using the whole content of the string as the index
value. Close to useless.

The contrib code is in contrib/fulltextindex.

I have a hacked version of that which changes it to keyword indexing, if you're
interested.

Regards,
Andrew.
--
_____________________________________________________________________
Andrew McMillan, e-mail: Andrew(at)catalyst(dot)net(dot)nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Francis K Shim 2001-04-11 07:00:12 Given: Win98, Cygwin, postgresql-7.0.3.tar.gz | Need: to make libpg/(++) libraries?
Previous Message Steven D. Arnold 2001-04-11 04:25:43 plpgsql -- arrays/temporary tables?

Browse pgsql-hackers by date

  From Date Subject
Next Message Rainer Mager 2001-04-11 06:58:34 RE: Problem with 7.0.3 dump -> 7.1b4 restore
Previous Message Mark Butler 2001-04-11 03:11:52 Re: Extensible mechanism for type promotion / demotion