From: | "Michael Lewis" <mikelikespie(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #5532: Valid UTF8 sequence errors as invalid |
Date: | 2010-06-30 08:42:25 |
Message-ID: | 201006300842.o5U8gPHY060899@wwwmaster.postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged online:
Bug reference: 5532
Logged by: Michael Lewis
Email address: mikelikespie(at)gmail(dot)com
PostgreSQL version: 9.0 trunk
Operating system: OS X
Description: Valid UTF8 sequence errors as invalid
Details:
I'm using Python to strip invalid UTF8 characters from my logs before
COPYing them into postgres. I came across this one sequence that seems to
be valid UTF8 (in the extended range, I believe).
It passes through both Python's encoding and iconv without an error and
is valid as far as my understanding of UTF8 goes, so I am assuming it is a
bug.
Test case:
create table t (v varchar);
insert into t values (E'\xed\xbc\xad');
To validate it in bash, run:
echo -e "\xed\xbc\xad" | iconv -f UTF-8 ; echo $?
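The three bytes in the test case can also be decoded by hand to see which code point they encode. The following is a minimal Python 3 sketch, not PostgreSQL's validation code; the helper names `decode_3byte` and `is_surrogate` are hypothetical and exist only for this illustration.

```python
def decode_3byte(seq: bytes) -> int:
    """Decode one 3-byte sequence of the form 1110xxxx 10xxxxxx 10xxxxxx by hand.

    Hypothetical helper: it only unpacks the payload bits and does not
    apply any validity rules beyond the lead/continuation bit patterns.
    """
    if len(seq) != 3:
        raise ValueError("expected exactly three bytes")
    b0, b1, b2 = seq
    if b0 & 0xF0 != 0xE0 or b1 & 0xC0 != 0x80 or b2 & 0xC0 != 0x80:
        raise ValueError("not a 3-byte lead/continuation pattern")
    # 4 payload bits from the lead byte, 6 from each continuation byte.
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


def is_surrogate(cp: int) -> bool:
    """True if the code point falls in the UTF-16 surrogate range."""
    return 0xD800 <= cp <= 0xDFFF


cp = decode_3byte(b"\xed\xbc\xad")
print(f"U+{cp:04X}", "surrogate" if is_surrogate(cp) else "not a surrogate")
# prints: U+DF2D surrogate
```

The bit pattern is well formed, but the resulting code point, U+DF2D, lies in the surrogate range U+D800 to U+DFFF. Decoders differ in whether they accept such sequences: a strict Python 3 call such as `b"\xed\xbc\xad".decode("utf-8")` raises `UnicodeDecodeError`, which may explain why different tools disagree about this input.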
Thanks,
Mike