Re: BUG #3730: Creating a swedish dictionary fails

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc:	penty(dot)wenngren(at)dgc(dot)se, pgsql-bugs(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject:	Re: BUG #3730: Creating a swedish dictionary fails
Date:	2007-11-09 18:49:27
Message-ID:	13391.1194634167@sss.pgh.pa.us
Lists:	pgsql-bugs

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I am wondering if the newline being included in the token could be
> causing a problem.

Nope. I traced through it and the problem is that char2wchar() is
completely brain-dead: at some places it thinks that "len" is the
length of the output wchar array, and at others it thinks that "len"
is the number of bytes in the input. In particular, _t_isalpha()
fails completely for any multibyte character, because the pnstrdup
call truncates the character to 1 byte.

After looking at the callers I'm inclined to think that the only
safe way to implement this routine is to change its API to provide
both counts. Comments?

regards, tom lane
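
[Editorial sketch, not part of the original message.] The API change proposed above, passing both the input byte count and the output array size, could look roughly like the following. This is a minimal illustration built on the standard C function mbrtowc(); the name mb_to_wchar_sketch and its parameter order are hypothetical and are not taken from the PostgreSQL source, and the actual fix may well differ.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL's
 * char2wchar): convert at most `fromlen` bytes of `from` into at most
 * `tolen - 1` wide characters in `to`, always NUL-terminating the output.
 *
 * Keeping the two counts separate avoids the ambiguity described above:
 * `tolen` is measured in wchar_t elements, `fromlen` in input bytes.
 *
 * Returns the number of wide characters written (excluding the
 * terminator), or (size_t) -1 on an invalid or incomplete sequence.
 */
size_t
mb_to_wchar_sketch(wchar_t *to, size_t tolen,
                   const char *from, size_t fromlen)
{
    mbstate_t   state;
    size_t      nout = 0;

    if (tolen == 0)
        return 0;               /* no room even for the terminator */

    memset(&state, 0, sizeof(state));

    /* Stop when the input is consumed or only the terminator slot is left. */
    while (fromlen > 0 && nout + 1 < tolen)
    {
        /* mbrtowc may consume several bytes for one character. */
        size_t      n = mbrtowc(&to[nout], from, fromlen, &state);

        if (n == (size_t) -1 || n == (size_t) -2)
        {
            to[0] = L'\0';
            return (size_t) -1; /* invalid or truncated multibyte char */
        }
        if (n == 0)
            break;              /* embedded NUL ends the input */

        from += n;
        fromlen -= n;
        nout++;
    }

    to[nout] = L'\0';
    return nout;
}
```

With this shape, a caller testing a single multibyte character would pass the character's full byte length as fromlen and the size of its wchar_t buffer as tolen, so the character is never cut down to one byte before conversion. The conversion follows the current LC_CTYPE locale, so the caller is assumed to have set a suitable locale beforehand.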

In response to

Re: BUG #3730: Creating a swedish dictionary fails at 2007-11-09 13:56:02 from Alvaro Herrera

Responses

Re: BUG #3730: Creating a swedish dictionary fails at 2007-11-09 19:10:15 from Alvaro Herrera
