TSEARCH2 Thesaurus limitations

From: Theodore Wong <wongteddy40(at)hotmail(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: TSEARCH2 Thesaurus limitations
Date: 2008-10-21 04:41:50
Message-ID: BAY139-W5358935F9483D2D8BE25F8B22E0@phx.gbl
Lists: pgsql-hackers

Hi,

I'm new to Postgres and would appreciate some help
understanding the limitations of TSEARCH2 and its
thesaurus operation.

I'm trying to use the thesaurus as a geo-tagger/geocoder.
The first part of the problem is to create a place-name
list with additional information such as state, county,
and country names. However, the values returned are incorrect.

The problem is less pronounced when the thesaurus is small
(under 100 rows), but I'm trying to load 7 million rows.

I have not seen the latest TSEARCH2 code release, so
I don't have much understanding of its inner
workings.

Is there specific code that I can hack to remove
a fixed limitation, such as the number of tokens before the
indexer quits, or is the index type insufficient for this scale
of data?

Thanks,

Ted
