Re: BUG #4257: about unicode extend

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "arli weng" <program(at)163(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4257: about unicode extend
Date: 2008-06-21 15:55:21
Message-ID: 19196.1214063721@sss.pgh.pa.us
Lists: pgsql-bugs

"arli weng" <program(at)163(dot)com> writes:
> the command (chinese by utf-8):
> INSERT INTO "title" VALUES(46307243,46307898,'');
> in postgres report error:
> invalid byte sequence for encoding "UNICODE": 0xf0

I don't believe this is actually an 8.3 server. In 8.1 or later that
encoding would be referred to as "UTF8"; also, 8.1 and later would show
all bytes of the complained-of character, not just the first one.

8.0 and before only support 16-bit Unicode code points (i.e., UTF-8
sequences of at most 3 bytes), and 0xf0 is the lead byte of a 4-byte
sequence. We have support for 4-byte sequences in 8.1 and later. Also,
there were some fixes in this area in Jan 2007, so whichever branch
you use, make sure you get a minor release that's newer than that.
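
To illustrate why a 0xf0 byte points at a code point beyond the 16-bit
range, here is a minimal Python sketch. The helper name
utf8_sequence_length is hypothetical, and U+20000 (a CJK Extension B
character) is only an assumed stand-in, since the character from the
original report is not preserved in this message.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return the total sequence length implied by a UTF-8 lead byte.

    Hypothetical helper for illustration; raises ValueError for bytes
    that cannot start a valid UTF-8 sequence.
    """
    if lead_byte < 0x80:
        return 1  # ASCII
    if 0xC2 <= lead_byte <= 0xDF:
        return 2  # U+0080 .. U+07FF
    if 0xE0 <= lead_byte <= 0xEF:
        return 3  # U+0800 .. U+FFFF (the 16-bit range)
    if 0xF0 <= lead_byte <= 0xF4:
        return 4  # U+10000 .. U+10FFFF (beyond 16 bits)
    raise ValueError(f"0x{lead_byte:02x} is not a valid UTF-8 lead byte")


# A common CJK character inside the 16-bit range: 3 bytes.
bmp_char = "\u4e2d"
bmp_bytes = bmp_char.encode("utf-8")

# An assumed example character outside the 16-bit range: 4 bytes,
# starting with 0xf0, the byte named in the reported error.
ext_char = "\U00020000"
ext_bytes = ext_char.encode("utf-8")

print(len(bmp_bytes), bmp_bytes.hex())
print(len(ext_bytes), ext_bytes.hex())
print(utf8_sequence_length(ext_bytes[0]))
```

A server limited to 3-byte sequences would be expected to reject the
second string at its first byte, which is consistent with an error that
names only 0xf0.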

regards, tom lane
