Re: UPPER()/LOWER() and UTF-8

From: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexey Mahotkin <alexm(at)w-m(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UPPER()/LOWER() and UTF-8
Date: 2003-11-05 09:29:01
Message-ID: 20031105092901.GA20271@zf.jcu.cz
Lists: pgsql-hackers

On Tue, Nov 04, 2003 at 04:52:33PM -0500, Tom Lane wrote:
> Alexey Mahotkin <alexm(at)w-m(dot)ru> writes:
> > I'm running Postgresql 7.3.4 with ru_RU.UTF-8 locale (with UNICODE
> > database encoding), and all is almost well, except that UPPER() and
> > LOWER() seem to ignore locale.
>
> upper/lower aren't going to work desirably in any multi-byte character
> set encoding. I think Peter E. is looking into what it would take to

It's a PostgreSQL problem, not a UTF problem: the standard PostgreSQL text
functions know nothing about the encoding of their arguments, and so they
cannot use a different method (for example a UTF-aware lower/upper) to
work with the strings.

Maybe we could slightly extend the internal "text" datatype and, like
VARSIZE(), use a VARENCODING(). Maybe Peter already has a better idea.
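
To make the point about encoding-unaware text functions concrete, here is a
minimal sketch. It is not PostgreSQL code: Python stands in for the server's
C routines, the function names bytewise_upper and encoding_aware_upper are
hypothetical, and the byte-wise mapping is simplified to ASCII a-z only. It
shows that mapping case one byte at a time leaves UTF-8 Cyrillic text
untouched, while a routine that is told the encoding can do the job.

```python
# Illustrative sketch only; names and the ASCII-only mapping are assumptions.

def bytewise_upper(raw: bytes) -> bytes:
    """Upper-case each byte independently, as an encoding-unaware routine would.

    Only ASCII a-z (0x61-0x7A) is mapped; every other byte passes through.
    """
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in raw)


def encoding_aware_upper(raw: bytes, encoding: str) -> bytes:
    """Decode with the known encoding, upper-case the characters, re-encode."""
    return raw.decode(encoding).upper().encode(encoding)


word = "привет"
utf8 = word.encode("utf-8")

# Every byte of this UTF-8 string is >= 0x80, so nothing is changed.
unaware = bytewise_upper(utf8)

# Knowing the encoding lets the routine operate on whole characters.
aware = encoding_aware_upper(utf8, "utf-8")
```

The second function needs the encoding as an extra argument, which is
roughly the information a per-value VARENCODING() would have to supply.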

> fix this for 7.5, but at present you are going to need to use a
> single-byte encoding within the server. (Nothing to stop you from using
> UTF-8 on the client side though.)

You can use multibyte on the server side too, but then you must use, for
example, the convert() function on the upper/lower arguments.

Karel

--
Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
http://home.zf.jcu.cz/~zakkr/
