Re: BUG #3730: Creating a swedish dictionary fails

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: penty(dot)wenngren(at)dgc(dot)se, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #3730: Creating a swedish dictionary fails
Date: 2007-11-09 13:56:02
Message-ID: 20071109135602.GE2768@alvh.no-ip.org
Lists: pgsql-bugs

Tom Lane wrote:
> Penty Wenngren <penty(dot)wenngren(at)dgc(dot)se> writes:
> > I used iconv to convert svenska.aff and svenska.datalist (from
> > iswedish-1.2.1) to UTF-8. The converted files can be found at:
> > http://www.lederhosen.org/swedish.affix
> > http://www.lederhosen.org/swedish.dict
>
> I think the reason it's failing right there is that that line is the
> first affix rule containing a non-ASCII letter, and the rules are
> supposed to only contain letters and certain specific punctuation.
> I suspect you are working in a locale that doesn't think Ö is a
> letter --- check lc_ctype.

I patched parse_affentry to report the current token and I see this:

alvherre=# CREATE TEXT SEARCH DICTIONARY swedish_ispell (
    TEMPLATE = ispell,
    DictFile = swedish,
    AffFile = swedish,
    StopWords = swedish);
ERROR: syntax error at line 149 (str: "örs
") of affix file "/home/alvherre/Code/CVS/pgsql/install/00orig/share/tsearch_data/swedish.affix"

I am wondering if the newline being included in the token could be
causing a problem.
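
To illustrate the suspicion, here is a minimal sketch of how a trailing
newline alone could make a "letters only" check reject an otherwise valid
token. This is not the parse_affentry code. Both helpers
(token_is_all_letters and strip_trailing_newline) are hypothetical, and for
simplicity the check assumes every non-ASCII byte belongs to a letter, which
sidesteps the separate lc_ctype question.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code: accept a token only if it is
 * non-empty and every ASCII byte in it is a letter.  Bytes >= 0x80 are
 * assumed to be part of a multibyte letter (a simplification).
 */
static bool
token_is_all_letters(const char *tok)
{
    const unsigned char *p;

    if (*tok == '\0')
        return false;

    for (p = (const unsigned char *) tok; *p; p++)
    {
        /* '\n', digits and punctuation are all rejected here */
        if (*p < 0x80 && !isalpha(*p))
            return false;
    }
    return true;
}

/* Hypothetical helper: remove any trailing '\n' or '\r' in place. */
static void
strip_trailing_newline(char *tok)
{
    size_t      len = strlen(tok);

    while (len > 0 && (tok[len - 1] == '\n' || tok[len - 1] == '\r'))
        tok[--len] = '\0';
}
```

With these, the UTF-8 token "örs" followed by a newline fails the check,
while the same token passes once strip_trailing_newline has been applied.
Whether the real parser behaves this way is exactly what remains to be
confirmed.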

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
