Re: BUG #3730: Creating a swedish dictionary fails

From: Penty Wenngren <penty(dot)wenngren(at)dgc(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #3730: Creating a swedish dictionary fails
Date: 2007-11-09 00:44:49
Message-ID: 20071109004449.GA65896@picard.dgc.se
Lists: pgsql-bugs

On Thu, Nov 08, 2007 at 05:21:17PM -0500, Tom Lane wrote:
> Penty Wenngren <penty(dot)wenngren(at)dgc(dot)se> writes:
> > I used iconv to convert svenska.aff and svenska.datalist (from
> > iswedish-1.2.1) to UTF-8. The converted files can be found at:
> > http://www.lederhosen.org/swedish.affix
> > http://www.lederhosen.org/swedish.dict
> 
> I think the reason it's failing right there is that that line is the
> first affix rule containing a non-ASCII letter, and the rules are
> supposed to only contain letters and certain specific punctuation.
> I suspect you are working in a locale that doesn't think Ö is a
> letter --- check lc_ctype.
> 

It doesn't seem to make any difference. The first try was done from a
terminal that didn't care much for UTF-8, but that is fixed now and I
still get the same result. Could it be that iconv's conversion is
broken then, or that I did something terribly wrong in the conversion
process (iconv -f ISO-8859-1 -t UTF-8 svenska.aff > swedish.affix)?
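
One way to rule out a broken conversion is to check the bytes of the converted file directly. The following is a minimal Python sketch, not part of the original report; the function name decode_report is hypothetical. It assumes the only failure modes of interest are a file that is not valid UTF-8 (for example, still ISO-8859-1) and a file that was converted twice.

```python
def decode_report(raw: bytes) -> dict:
    """Report whether raw is valid UTF-8 and whether it looks double-encoded.

    Hypothetical helper for illustration only.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Not valid UTF-8, e.g. the file is still ISO-8859-1.
        return {"valid_utf8": False, "bad_offset": exc.start, "double_encoded": False}

    double = False
    if not text.isascii():
        # If the decoded text maps back to Latin-1 bytes that are themselves
        # valid UTF-8, the file was probably run through iconv twice.
        try:
            text.encode("latin-1").decode("utf-8")
            double = True
        except (UnicodeEncodeError, UnicodeDecodeError):
            double = False
    return {"valid_utf8": True, "bad_offset": None, "double_encoded": double}


# Usage (path taken from the error message in this report):
# with open("/usr/local/share/postgresql/tsearch_data/swedish.affix", "rb") as f:
#     print(decode_report(f.read()))
```

A result with valid_utf8 true and double_encoded false would suggest the iconv step itself is fine and the problem lies elsewhere.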

$ echo $LANG
sv_SE.UTF-8

$ echo $LC_CTYPE
sv_SE.UTF-8

$ psql test
Välkommen till psql 8.3beta2, den interaktiva PostgreSQL-terminalen.

Skriv:  \copyright för upphovsrättsinformation
        \h för hjälp om SQL-kommandon
        \? för hjälp om psql-kommandon
        \g eller avsluta med semikolon för att köra en fråga
        \q för att avsluta

test=# CREATE TEXT SEARCH DICTIONARY swedish_ispell (
TEMPLATE = ispell,
DictFile = swedish,
AffFile = swedish,
StopWords = swedish);
FEL:  syntax error at line 219 of affix file
"/usr/local/share/postgresql/tsearch_data/swedish.affix"


I also tried to convert the file again, this time from a terminal that
likes UTF-8, thinking that might have an effect, but the affix file looks
the same.

I found a post in the archives regarding a similar problem:
http://archives.postgresql.org/pgsql-hackers/2007-08/msg00825.php

It seems editing the affix file and manually removing some lines at
least partially solved the problem in that case.
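
To see which lines would be candidates for that kind of manual editing, the lines containing non-ASCII characters can be listed with their line numbers. This is a minimal Python sketch under the assumption that the affix file is already valid UTF-8 and that lines are counted by newline characters; the function name non_ascii_lines is hypothetical.

```python
def non_ascii_lines(text: str):
    """Yield (line_number, line) for each line with a non-ASCII character.

    Hypothetical helper for illustration only. Lines are split on "\n" so
    that the numbering follows newline counts.
    """
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.isascii():
            yield number, line


# Usage (path taken from the error message in this report):
# with open("/usr/local/share/postgresql/tsearch_data/swedish.affix",
#           encoding="utf-8") as f:
#     for number, line in non_ascii_lines(f.read()):
#         print(number, line)
```

If line 219 is the first line reported, that would be consistent with the failure being triggered by the first rule containing a non-ASCII letter.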

// Penty

-- 

Penty Wenngren
DGC Solutions AB
