From: | Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> |
---|---|
To: | obartunov(at)gmail(dot)com, Wolfgang Winkler <wolfgang(dot)winkler(at)digital-concepts(dot)com> |
Cc: | Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Using a german affix file for compound words |
Date: | 2016-01-28 16:34:46 |
Message-ID: | 56AA4326.7030503@postgrespro.ru |
Lists: | pgsql-general |
On 28.01.2016 18:57, Oleg Bartunov wrote:
>
>
> On Thu, Jan 28, 2016 at 6:04 PM, Wolfgang Winkler
> <wolfgang(dot)winkler(at)digital-concepts(dot)com
> <mailto:wolfgang(dot)winkler(at)digital-concepts(dot)com>> wrote:
>
> Hi!
>
> We have a problem importing a compound dictionary file for German.
>
> I downloaded the files here:
>
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
>
> and converted them to utf-8 with iconv. The affix file seems ok when
> opened with an editor.
>
> When I try to create or alter a dictionary to use this affix file, I
> get the following error:
>
> alter TEXT SEARCH DICTIONARY german_ispell (
> DictFile = german,
> AffFile = german,
> StopWords = german
> );
> ERROR: syntax error
> CONTEXT: line 224 of configuration file
> "/usr/local/pgsql/share/tsearch_data/german.affix": " ABE > -ABE,äBIN
> "
>
> This is the first occurrence of an umlaut character in the file.
> I've found a few postings where the same file is used, e.g.:
>
> http://www.postgresql.org/message-id/flat/556C1411(dot)4010608(at)tbz-pariv(dot)de#556C1411(dot)4010608@tbz-pariv.de
>
> This user has been able to import the file. Am I missing something
> obvious?
>
What version of PostgreSQL do you use?
I tested this dictionary on PostgreSQL 9.4.5. I downloaded the files from
the link and ran these commands:
iconv -f ISO-8859-1 -t UTF-8 german.aff -o german2.affix
iconv -f ISO-8859-1 -t UTF-8 german.dict -o german2.dict
I renamed them to german.affix and german.dict and moved them to the
tsearch_data directory. The following commands then ran without errors:
-> create text search dictionary german_ispell (
Template = ispell,
DictFile = german,
AffFile = german,
Stopwords = german
);
DROP TEXT SEARCH DICTIONARY
-> select ts_lexize('german_ispell', 'test');
ts_lexize
-----------
{test}
(1 row)
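Since the reported error shows up at the first umlaut in the affix file, it may be worth confirming that the converted files really are valid UTF-8 before loading them. The sketch below is only a sanity check, not part of the steps I ran above: is_utf8 is a hypothetical helper name, and it assumes an iconv that exits non-zero on invalid input (as the glibc and libiconv versions do).

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report whether a file decodes
# cleanly as UTF-8 by round-tripping it through iconv.
is_utf8() {
    # iconv exits non-zero when it meets a byte sequence that is invalid
    # in the source encoding; the converted output itself is discarded.
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Example usage; adjust the file names to your setup:
# for f in german.affix german.dict; do
#     if is_utf8 "$f"; then
#         echo "$f: valid UTF-8"
#     else
#         echo "$f: NOT valid UTF-8"
#     fi
# done
```

If a file fails this check, it was probably left in its original encoding or converted from the wrong source encoding, and rerunning iconv with the correct -f value would be the next thing to try.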
--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company