Re: Using a german affix file for compound words

From: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: obartunov(at)gmail(dot)com, Wolfgang Winkler <wolfgang(dot)winkler(at)digital-concepts(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using a german affix file for compound words
Date: 2016-01-28 16:34:46
Message-ID: 56AA4326.7030503@postgrespro.ru
Lists: pgsql-general

On 28.01.2016 18:57, Oleg Bartunov wrote:
>
>
> On Thu, Jan 28, 2016 at 6:04 PM, Wolfgang Winkler
> <wolfgang(dot)winkler(at)digital-concepts(dot)com> wrote:
>
> Hi!
>
> We have a problem with importing a compound dictionary file for german.
>
> I downloaded the files here:
>
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
>
> and converted them to utf-8 with iconv. The affix file seems ok when
> opened with an editor.
>
> When I try to create or alter a dictionary to use this affix file, I
> get the following error:
>
> alter TEXT SEARCH DICTIONARY german_ispell (
> DictFile = german,
> AffFile = german,
> StopWords = german
> );
> ERROR: syntax error
> CONTEXT: line 224 of configuration file
> "/usr/local/pgsql/share/tsearch_data/german.affix": " ABE > -ABE,äBIN
> "
>
> This is the first occurrence of an umlaut character in the file.
> I've found a few postings where the same file is used, e.g.:
>
> http://www.postgresql.org/message-id/flat/556C1411(dot)4010608(at)tbz-pariv(dot)de#556C1411(dot)4010608@tbz-pariv.de
>
> This user was able to import the file. Am I missing something
> obvious?
>

What version of PostgreSQL do you use?

I tested this dictionary on PostgreSQL 9.4.5. I downloaded the files from
the link above and ran these commands:

iconv -f ISO-8859-1 -t UTF-8 german.aff -o german2.affix
iconv -f ISO-8859-1 -t UTF-8 german.dict -o german2.dict
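
If iconv is not at hand, roughly the same conversion can be sketched in
Python. This is only an illustration: the function name latin1_to_utf8 is
made up for this example, and it assumes the downloaded files really are
ISO-8859-1.

```python
from pathlib import Path


def latin1_to_utf8(src: str, dst: str) -> None:
    """Re-encode src (assumed to be ISO-8859-1) as UTF-8 and write it to dst."""
    # ISO-8859-1 maps every byte to a code point, so this decode never fails.
    # The flip side: a file that is already UTF-8 would be silently
    # double-encoded, so run this only once on the original files.
    text = Path(src).read_bytes().decode("iso-8859-1")
    Path(dst).write_bytes(text.encode("utf-8"))


# Intended use, mirroring the two iconv calls:
#   latin1_to_utf8("german.aff", "german2.affix")
#   latin1_to_utf8("german.dict", "german2.dict")
```

Because the decode step cannot fail, running a conversion twice produces a
file that looks plausible in an editor but contains doubled-up bytes for
each umlaut, so it is worth keeping the untouched originals around.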

I renamed them to german.affix and german.dict and moved them to the
tsearch_data directory. The following commands then ran without errors:

-> create text search dictionary german_ispell (
Template = ispell,
DictFile = german,
AffFile = german,
Stopwords = german
);
CREATE TEXT SEARCH DICTIONARY

-> select ts_lexize('german_ispell', 'test');
ts_lexize
-----------
{test}
(1 row)
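
If the error at line 224 persists on your side, it may be worth checking
whether the file in tsearch_data is really valid UTF-8. A minimal Python
sketch of such a check follows; the helper name first_invalid_utf8_line is
made up for this example, and the check is only a guess at where the
problem might lie, not a confirmed diagnosis.

```python
from typing import Optional, Tuple


def first_invalid_utf8_line(path: str) -> Optional[Tuple[int, bytes]]:
    """Return (line_number, raw_bytes) for the first line that does not
    decode as UTF-8, or None if the whole file is valid UTF-8."""
    # Read in binary mode so no decoding happens before we test each line.
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                return lineno, raw
    return None


# Intended use (path taken from the error message above):
#   first_invalid_utf8_line("/usr/local/pgsql/share/tsearch_data/german.affix")
```

A result of None only says the bytes are well-formed UTF-8. It does not rule
out a file that was converted twice, or a database whose encoding is not
UTF-8.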

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
