> Tatsuo Ishii wrote:
> > Why do you think that a UTF-8 encoded string starting with 0x92 is
> > valid?
> > 0x92 can appear in the second, third or fourth octet, but should never
> > appear in the first octet.
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> >> The following bug has been logged online:
> >> Bug reference: 3638
> >> Logged by: Fil Matthews
> >> Email address: fil(at)internetmediapro(dot)com
> >> PostgreSQL version: 8-1 , 8-2
> >> Operating system: Linux Debian - Windows XP
> >> Description: UTF8 Character encoding does NOT work
> >> Details:
> >> Judging from the amount of Google page hits with the exact same problem I am
> >> surprised and mystified by this obvious flaw in Postgres Technology..
> >> Just how is one expected to work with UTF8 character sets when all and
> >> every attempt at using even Postgres clients produces the SAME problem
> >> every time ???
> >> "invalid byte sequence for encoding "UTF8": 0x92"
> >> In Short A Postgres UTF8 database .. PGCLIENTENCODING=UTF8
> >> Tables test.text -> (Character varying 10)
> >> In any Postgres Client ie psql , dbadmin III
> >> Insert into test values ( chr(146));;
> >> Query returned successfully: 1 rows affected, 32 ms execution time.
> >> copy test to '/tmp/testfile.txt';
> >> Query returned successfully: 1 rows affected, 15 ms execution time.
> >> copy test from '/tmp/testfile.txt';
> >> Come on are you serious?? .. Just how does one work with completely valid
> >> data that has an ascii 128 + value ??
> >> Currently this flaw makes Postgres an unusable database technology .. Or
> >> can someone please explain this and a possible work around .. ??
> >> Thank You
> Sorry But I don't agree.. Why can't Postgres store a legitimate 8 bit
> byte value that is below 255?? and treat it as text ..
> Not being able to do this makes Postgres unusable.. for storing
> TEXT values..
> I do not know ANY other database technology that doesn't allow some form
> of storing a legitimate 8 bit byte ...
> Even the simplest open-source database in the world (and most
> popular) can do this..
> The biggest and best (Thank you Larry) can do this ...
> Postgres can't.
> In other words You are claiming that UTF8 is actually UTF7 ....
> There are 8 bits in a byte.. not 7 .. If UTF8 can't by definition
> store 8 bits then what standard can??
UTF-8 does not accept arbitrary 8-bit characters. The byte ranges UTF-8
accepts are precisely defined in the standard. If our implementation
differs from it, please let us know.
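
To make the byte ranges concrete, here is a minimal Python sketch (not
PostgreSQL code; the function names are made up for illustration). It
classifies a byte by the role RFC 3629 allows it to play, and uses
Python's strict UTF-8 decoder as a stand-in for the server's check.

```python
def classify_utf8_byte(b: int) -> str:
    """Classify one byte by the role it may play in well-formed UTF-8 (RFC 3629)."""
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b <= 0x7F:
        return "single-byte (ASCII)"
    if b <= 0xBF:
        return "continuation only"          # 0x80..0xBF, never a first octet
    if 0xC2 <= b <= 0xDF:
        return "lead of 2-byte sequence"
    if 0xE0 <= b <= 0xEF:
        return "lead of 3-byte sequence"
    if 0xF0 <= b <= 0xF4:
        return "lead of 4-byte sequence"
    return "never valid"                    # 0xC0, 0xC1, 0xF5..0xFF


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0x92 (decimal 146) on its own is a continuation byte, so it is rejected.
print(classify_utf8_byte(0x92))
print(is_valid_utf8(b"\x92"))
# The code point U+0092 itself is fine; in UTF-8 it is the two bytes C2 92.
print(chr(146).encode("utf-8").hex())
```

So a lone 0x92 byte in the input is rejected, while the character with
code point 146 is stored as two bytes.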
> The technology is wrong and it is incorrect... If one looks at the
> output of the copy file
> od -c then QUITE correctly the 8 bit value is stored as the value
> What then is the problem in putting this value back in the text field it
> came from ??
PostgreSQL needs to follow the standard. That's it.
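
As for a workaround, one possibility is to transcode the data before
loading it. The sketch below assumes the file was produced in a
single-byte Windows encoding such as Windows-1252, where 0x92 is the
right single quotation mark. That is only a guess about the data; the
report does not say where the bytes came from, and the helper name is
hypothetical.

```python
def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes assumed to be Windows-1252 as UTF-8.

    Raises UnicodeDecodeError for the few byte values that Python's
    cp1252 codec leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    """
    return raw.decode("cp1252").encode("utf-8")


# Under the cp1252 assumption, 0x92 is U+2019, which is E2 80 99 in UTF-8.
print(cp1252_to_utf8(b"it\x92s").hex(" "))
```

The converted file contains only well-formed UTF-8 and should pass the
encoding check, provided the assumption about the source encoding is
right.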
SRA OSS, Inc. Japan