Re: BUG #3819: UTF8 can't handle \000

From: "Franklin Schmidt" <fschmidt(at)gmail(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #3819: UTF8 can't handle \000
Date: 2007-12-17 09:50:15
Message-ID: 7c63948f0712170150w7957adc1wa4227093b700dc17@mail.gmail.com
Lists: pgsql-bugs
On Dec 17, 2007 1:28 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> Well, I realize 0x00 is a valid ASCII value and therefore a valid UTF8
> value but we have never had anyone complain they can't store the 0x00
> character because it doesn't mean anything in ASCII.  They use bytea to
> store binary data like 0x00.


Here are a few complaints:

http://www.nabble.com/-tp9058998.html
http://www.nabble.com/-tp11750041.html
http://www.nabble.com/-tp8414157.html

I agree that storing 0x00 in a UTF8 string is weird, but I am
converting a huge database to postgres, and in a huge database, weird
things happen.  Using bytea for a text field just because one in a
million records has a 0x00 doesn't make sense to me.  I did hack
around it in my conversion code to remove the 0x00 but I expect that
anyone else who tries converting a big database to postgres will also
confront this issue.
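
A minimal sketch of that kind of workaround, for anyone facing the same conversion. The original conversion code is not shown in this thread, so everything here is an assumption: Python as the language, rows represented as dicts, and the hypothetical helper names strip_nul and clean_row. It only removes the 0x00 character from string fields before they are inserted into a text column.

```python
def strip_nul(value: str) -> str:
    """Return value with every NUL (0x00) character removed."""
    return value.replace("\x00", "")


def clean_row(row: dict) -> dict:
    """Strip NUL characters from string fields; leave other types untouched.

    'row' is a hypothetical column-name -> value mapping produced by the
    source database reader; it is not an API from PostgreSQL or any driver.
    """
    return {
        column: strip_nul(value) if isinstance(value, str) else value
        for column, value in row.items()
    }
```

Something like clean_row({"id": 1, "name": "abc\x00def"}) would be applied to each record just before the insert. Note that this silently discards the 0x00, so it only fits cases where the character carries no meaning.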
