Re: Unicode problems on IRC

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: andrew(at)supernews(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unicode problems on IRC
Date: 2005-04-10 23:40:59
Message-ID: 4259B98B.6030509@opencloud.com
Lists: pgsql-hackers

Tom Lane wrote:

> Yeah? Cool. Does John's proposed patch do it "correctly"?
>
> http://candle.pha.pa.us/mhonarc/patches2/msg00076.html

Some comments on that patch:

Doesn't pg_utf2wchar_with_len need changes for the longer sequences?

UtfToLocal also appears to need changes.

If we support sequences longer than 4 bytes (code points above U+10FFFF),
then UtfToLocal/LocalToUtf and the associated translation tables need a
redesign, as they currently assume the sequence fits in an unsigned int.
(IIRC, Unicode doesn't use code points above U+10FFFF, but UTF-8 can
encode them?)
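
To illustrate the size concern, here is a minimal sketch, not the actual
PostgreSQL code. It assumes a 32-bit unsigned int and assumes a table key
is formed by packing the raw bytes of one sequence into that integer. The
names utf8_seq_len and pack_utf8_key are hypothetical, and the length
check follows the original 6-byte UTF-8 scheme without validating
continuation bytes or rejecting overlong forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Length of a UTF-8 sequence as implied by its lead byte, using the
 * original scheme that allowed up to 6 bytes. Returns 0 for a byte that
 * cannot start a sequence (a continuation byte, 0xFE, or 0xFF).
 */
static int
utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    if ((lead & 0xFC) == 0xF8)
        return 5;
    if ((lead & 0xFE) == 0xFC)
        return 6;
    return 0;
}

/*
 * Hypothetical helper: pack the raw bytes of one sequence into a 32-bit
 * key, most significant byte first. Returns 1 on success, 0 if the
 * sequence does not fit (fewer than 1 or more than 4 bytes).
 */
static int
pack_utf8_key(const unsigned char *s, int len, uint32_t *key)
{
    uint32_t    k = 0;
    int         i;

    assert(s != NULL && key != NULL);

    if (len < 1 || len > 4)
        return 0;

    for (i = 0; i < len; i++)
        k = (k << 8) | s[i];

    *key = k;
    return 1;
}
```

With this packing, U+10FFFF (bytes F4 8F BF BF) still fits, but the
smallest 5-byte sequence (F8 88 80 80 80, i.e. U+200000) is rejected,
which is the case where a wider key or a different table layout would be
needed.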

-O
