Re: BUG #3638: UTF8 Character encoding does NOT work

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: fil(at)internetmediapro(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #3638: UTF8 Character encoding does NOT work
Date: 2007-09-27 07:57:38
Message-ID: 20070927.165738.113091771.t-ishii@sraoss.co.jp
Lists: pgsql-bugs

Why do you think that a UTF-8 encoded string starting with 0x92 is
valid?

0x92 can appear in the second, third or fourth octet, but should never
appear in the first octet.
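
The rule above can be checked with a small sketch. This is an illustration
only, not PostgreSQL's own validation code: the helper names
utf8_octet_kind and is_valid_utf8 are made up for this example, and the
byte ranges follow the UTF-8 definition in RFC 3629.

```python
def utf8_octet_kind(octet: int) -> str:
    """Classify one octet by the role it can play in UTF-8 (hypothetical helper)."""
    if octet <= 0x7F:
        return "single-octet (ASCII)"
    if 0x80 <= octet <= 0xBF:
        return "continuation"          # only valid as 2nd, 3rd or 4th octet
    if 0xC2 <= octet <= 0xDF:
        return "lead of 2-octet sequence"
    if 0xE0 <= octet <= 0xEF:
        return "lead of 3-octet sequence"
    if 0xF0 <= octet <= 0xF4:
        return "lead of 4-octet sequence"
    return "never valid"               # 0xC0, 0xC1, 0xF5..0xFF


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0x92 on its own is a continuation octet, so it cannot start a string.
kind = utf8_octet_kind(0x92)
lone_ok = is_valid_utf8(b"\x92")

# Code point 146 (U+0092) encoded as UTF-8 takes two octets, with 0x92 second.
encoded = chr(146).encode("utf-8")
pair_ok = is_valid_utf8(encoded)

print(kind, lone_ok, encoded.hex(), pair_ok)
```

So a file that contains the single octet 0x92 is not UTF-8 text, while the
two-octet form 0xC2 0x92 is.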
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> The following bug has been logged online:
>
> Bug reference: 3638
> Logged by: Fil Matthews
> Email address: fil(at)internetmediapro(dot)com
> PostgreSQL version: 8.1, 8.2
> Operating system: Linux Debian - Windows XP
> Description: UTF8 Character encoding does NOT work
> Details:
>
> Judging from the number of Google page hits for the exact same problem, I am
> surprised and mystified by this obvious flaw in Postgres technology.
>
> Just how is one expected to work with UTF8 character sets when each and
> every attempt at using even Postgres clients produces the SAME problem
> every time?
>
> "invalid byte sequence for encoding "UTF8": 0x92"
>
> In short: a Postgres UTF8 database, with PGCLIENTENCODING=UTF8
>
> Table test.text -> (character varying(10))
>
> In any Postgres client, e.g. psql, dbadmin III
>
> INSERT INTO test VALUES (chr(146));
>
>
> Query returned successfully: 1 rows affected, 32 ms execution time.
>
> copy test to '/tmp/testfile.txt';
>
>
> Query returned successfully: 1 rows affected, 15 ms execution time.
>
> copy test from '/tmp/testfile.txt';
>
>
> Come on, are you serious? Just how does one work with completely valid
> data that has an ASCII value of 128 or above?
>
> Currently this flaw makes Postgres an unusable database technology. Or
> can someone please explain this and a possible workaround?
>
> Thank You
