About upper() and lower to handle multibyte char

From: Weiping <laser(at)qmail(dot)zhengmai(dot)net(dot)cn>
To: pgsql-general(at)postgresql(dot)org
Subject: About upper() and lower to handle multibyte char
Date: 2004-10-19 13:52:13
Message-ID: 41751C0D.2020209@qmail.zhengmai.net.cn
Lists: pgsql-general

Hi,

While upgrading to 8.0 (beta3) we ran into a problem:

We have a database whose encoding is UNICODE.
When we run a query like:
select upper('中文'); -- select some multibyte characters
PostgreSQL responds:

ERROR: invalid multibyte character for locale

But when we run the same query in a database with SQL_ASCII encoding,
it works and returns the string unchanged, which is what we think is the
correct result.

I've searched the archive and found that in 8.0 the upper()/lower()
functions were changed so that they can handle multibyte characters.
But what is the expected behavior of these two functions when dealing with
multibyte characters?

Another question: from the archive, I understand that on systems that
provide the <wctype.h> towupper/towlower functions, PostgreSQL supports
multibyte upper/lower. My system (Slackware 10) has <wctype.h>,
so why do I still get the ERROR? How can I check whether my PostgreSQL
installation was built with multibyte upper/lower support?
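
For reference, here is a minimal C sketch of the wide-character approach that the <wctype.h> route implies: convert the multibyte string to wchar_t, apply towupper(), and convert back. This is an illustration only, not PostgreSQL's actual source; the function name mb_upper and its buffer handling are hypothetical. It shows one plausible way such an error can arise: mbstowcs() fails when the input bytes are not a valid sequence in the process's current LC_CTYPE locale, for example UTF-8 bytes under a locale that uses a different encoding.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: upper-case a multibyte string via wide characters.
 * Uses whatever LC_CTYPE locale is currently set with setlocale().
 * Returns 0 on success, -1 if src is not valid in that locale,
 * if memory cannot be allocated, or if dst is too small.
 */
int mb_upper(const char *src, char *dst, size_t dstlen)
{
    /* Each wide character consumes at least one byte, so strlen is enough. */
    size_t cap = strlen(src);
    wchar_t *w = malloc((cap + 1) * sizeof(wchar_t));
    if (w == NULL)
        return -1;

    size_t n = mbstowcs(w, src, cap + 1);
    if (n == (size_t) -1) {
        free(w);
        return -1;              /* bytes are invalid for the current locale */
    }

    for (size_t i = 0; i < n; i++)
        w[i] = (wchar_t) towupper((wint_t) w[i]);

    /* wcstombs returns the bytes written, not counting the terminator. */
    size_t out = wcstombs(dst, w, dstlen);
    free(w);
    if (out == (size_t) -1 || out >= dstlen)
        return -1;              /* unconvertible, or no room for the NUL */
    return 0;
}
```

A caller would first select a locale, e.g. setlocale(LC_CTYPE, ""), and then call mb_upper("abc", buf, sizeof buf). Whether a string such as '中文' converts or fails depends on whether that locale's encoding matches the bytes being passed in, so comparing the server's locale with the database encoding may be a useful first check.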

This problem makes it very difficult for us to use upper/lower on
columns that mix scripts, such as Chinese and English characters in a
Unicode database, because the transaction aborts with the error above,
which badly breaks our application.

Thanks, and any help would be appreciated.

Laser
