TSearch2: Problems with compound words and stop words

From: Timo Haberkern <thaberkern(at)emedia-office(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: TSearch2: Problems with compound words and stop words
Date: 2004-11-05 07:30:28
Message-ID: 418B2C14.2010200@emedia-office.de
Lists: pgsql-general

Hi there,

I am having some trouble with my TSearch2 installation. I set it up as
described in

http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words

I used the German myspell dictionary from
http://lingucomponent.openoffice.org/spell_dic.html and converted it with
my2ispell.

Nearly everything works fine so far, except for two problems:

1.) The stop-word file seems to be ignored. If I run

SELECT to_tsvector('default_german', 'ein Haus')

I get

'ein':1 'haus':2

"ein" should be a stop word for German (and it is defined in the german.stop
file as well).


2.) The compound-words feature doesn't work either. I have tried a lot of
words, e.g. "Fehlermeldung". With

SELECT to_tsvector('default_german', 'Fehlermeldung')

I only get

'fehlermeldung':1

but I would expect "fehler" and "meldung" as separate entries. Is there
anything wrong with the dictionary or my configuration?
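
One possible way to narrow both problems down is to query the dictionary
directly and then look at how the parser classifies each token. This is only
a sketch: it assumes the stock tsearch2 helper functions lexize(),
set_curcfg() and ts_debug() are installed alongside the configuration shown
in this message.

```sql
-- Ask the ispell dictionary directly, bypassing the configuration mapping.
-- Assumption: lexize(dict_name, word) is available, as in stock tsearch2.
SELECT lexize('de_ispell', 'ein');            -- stop-word check
SELECT lexize('de_ispell', 'Fehlermeldung');  -- compound-word check

-- Show which token type and which dictionaries are used for each word.
-- ts_debug() works on the current configuration, so select it first.
SELECT set_curcfg('default_german');
SELECT * FROM ts_debug('ein Haus Fehlermeldung');
```

If lexize() already returns something unexpected, the dictionary files are
the likely place to look. If lexize() looks right but ts_debug() reports a
token type that is mapped to {simple}, the mapping in pg_ts_cfgmap would be
the first suspect.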


My current configuration:

pg_ts_cfg:

ts_name         | prs_name | locale
default         | default  | C
default_russian | default  | ru_RU.KOI8-R
simple          | default  | NULL
default_german  | default  | de_DE.ISO8859-1

pg_ts_cfgmap:

ts_name        | tok_alias    | dict_name
default_german | host         | {simple}
default_german | hword        | {simple}
default_german | int          | {simple}
default_german | nlhword      | {simple}
default_german | nlpart_hword | {simple}
default_german | nlword       | {simple}
default_german | part_hword   | {simple}
default_german | sfloat       | {simple}
default_german | uint         | {simple}
default_german | uri          | {simple}
default_german | url          | {simple}
default_german | version      | {simple}
default_german | word         | {simple}
default_german | lpart_hword  | {de_ispell,german_snowball}
default_german | lword        | {de_ispell,german_snowball}
default_german | lhword       | {de_ispell,german_snowball}


pg_ts_dict:

dict_name | dict_init | dict_initoption | dict_lexize | dict_comment

de_ispell | 17166 |
    DictFile="/usr/local/pgsql/share/contrib/dictonary/german.dict",
    AffFile="/usr/local/pgsql/share/contrib/dictonary/german.aff",
    StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" | 17167 | NULL
german_snowball | 17357 | NULL | 17162 | Snowball stemmer for german
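
For anyone who wants to compare against their own setup, the three listings
above can be produced with queries roughly like the following. This is a
sketch that assumes the stock tsearch2 catalog column names (ts_name,
prs_name, locale, tok_alias, dict_name, dict_initoption); those names are not
taken from the output pasted in this message.

```sql
-- Configurations and the locale each one is bound to.
SELECT ts_name, prs_name, locale FROM pg_ts_cfg;

-- Token type to dictionary mapping for the German configuration.
SELECT tok_alias, dict_name
FROM pg_ts_cfgmap
WHERE ts_name = 'default_german'
ORDER BY tok_alias;

-- Init options (dictionary, affix and stop-word file paths) per dictionary.
SELECT dict_name, dict_initoption
FROM pg_ts_dict
WHERE dict_name IN ('de_ispell', 'german_snowball');
```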




Can anyone help me?

regards

Timo
