Re: BUG #3730: Creating a swedish dictionary fails

From: Penty Wenngren <penty(dot)wenngren(at)dgc(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #3730: Creating a swedish dictionary fails
Date: 2007-11-09 00:44:49
Message-ID: 20071109004449.GA65896@picard.dgc.se
Lists: pgsql-bugs

On Thu, Nov 08, 2007 at 05:21:17PM -0500, Tom Lane wrote:
> Penty Wenngren <penty(dot)wenngren(at)dgc(dot)se> writes:
> > I used iconv to convert svenska.aff and svenska.datalist (from
> > iswedish-1.2.1) to UTF-8. The converted files can be found at:
> > http://www.lederhosen.org/swedish.affix
> > http://www.lederhosen.org/swedish.dict
>
> I think the reason it's failing right there is that that line is the
> first affix rule containing a non-ASCII letter, and the rules are
> supposed to only contain letters and certain specific punctuation.
> I suspect you are working in a locale that doesn't think Ö is a
> letter --- check lc_ctype.
>

It doesn't seem to make any difference. The first try was done from a
terminal that didn't care much for UTF-8, but that is fixed now and I
still get the same result. Could it be that iconv's conversion is
broken then, or that I did something terribly wrong in the conversion
process (iconv -f ISO-8859-1 -t UTF-8 svenska.aff > swedish.affix)?

$ echo $LANG
sv_SE.UTF-8

$ echo $LC_CTYPE
sv_SE.UTF-8

$ psql test
Välkommen till psql 8.3beta2, den interaktiva PostgreSQL-terminalen.

Skriv: \copyright för upphovsrättsinformation
\h för hjälp om SQL-kommandon
\? för hjälp om psql-kommandon
\g eller avsluta med semikolon för att köra en fråga
\q för att avsluta

test=# CREATE TEXT SEARCH DICTIONARY swedish_ispell (
    TEMPLATE = ispell,
    DictFile = swedish,
    AffFile = swedish,
    StopWords = swedish);
FEL: syntax error at line 219 of affix file
"/usr/local/share/postgresql/tsearch_data/swedish.affix"

I also tried converting the file again, this time from a terminal that
handles UTF-8, thinking that might have an effect, but the resulting
affix file looks the same.
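
One way to rule out a broken iconv conversion is to check the converted
file directly, independent of any terminal settings. The following is
only a minimal sketch and not something that was run for this report:
check_affix_encoding is a hypothetical helper name, and the script
assumes the path to the affix file is passed as the first argument. It
reports whether the bytes decode as UTF-8 and which line is the first to
contain a non-ASCII character, which can then be compared with the line
number in the error message.

```python
import sys


def check_affix_encoding(raw: bytes):
    """Return (is_valid_utf8, first_non_ascii_line).

    first_non_ascii_line is a 1-based line number, or None when the
    data is pure ASCII or is not valid UTF-8 at all.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not UTF-8, e.g. the file is still ISO-8859-1.
        return False, None
    # Split on "\n" only, so line numbers match a plain line count.
    for lineno, line in enumerate(text.split("\n"), start=1):
        if not line.isascii():
            return True, lineno
    return True, None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the affix file path is given on the command line.
    with open(sys.argv[1], "rb") as f:
        valid, first = check_affix_encoding(f.read())
    print("valid UTF-8:", valid)
    print("first line with non-ASCII characters:", first)
```

If the file is valid UTF-8 and the first non-ASCII line matches the line
in the error, the conversion itself is probably fine and the problem is
more likely in how the server classifies those characters.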

I found a post in the archives regarding a similar problem:
http://archives.postgresql.org/pgsql-hackers/2007-08/msg00825.php

It seems editing the affix file and manually removing some lines at
least partially solved the problem in that case.

// Penty

--

Penty Wenngren
DGC Solutions AB
