Re: Character Encoding problem

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: antony baxter <antony(dot)baxter(at)gmail(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Character Encoding problem
Date: 2008-04-07 04:34:44
Message-ID: 47F9A464.5020501@postnewspapers.com.au
Lists: pgsql-jdbc

antony baxter wrote:

> Displaying 'input' character by character:
> Character 0 = '8211'
> Character 1 = '235'
> Character 2 = '8212'
> Character 3 = '196'
> Character 4 = '8212'
> Character 5 = '231'
> Character 6 = '8211'
> Character 7 = '937'
> Character 8 = '8212'
> Character 9 = '199'

There's your problem. Your *input* is mangled.

The above decodes to:

–ë—Ä—ç–Ω—Ç
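
For anyone who wants to check that decoding themselves, here is a minimal sketch in Java. It rebuilds a String from the ten code points quoted above and prints each one with its Unicode name. The class and method names are made up for illustration; only standard JDK calls are used.

```java
final class CodePointDump {
    // Code points copied from the quoted diagnostic output.
    static final int[] REPORTED = {8211, 235, 8212, 196, 8212, 231, 8211, 937, 8212, 199};

    /** Builds a String from a list of Unicode code points. */
    static String fromCodePoints(int[] codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    /** Returns one line per code point: U+XXXX followed by the Unicode name. */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp ->
            sb.append(String.format("U+%04X %s%n", cp, Character.getName(cp))));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(fromCodePoints(REPORTED)));
    }
}
```

The first line printed is "U+2013 EN DASH", and the rest alternate between dashes and accented or Greek letters. A pattern like that, where every second character is one of the same two symbols, is typical of multi-byte UTF-8 text that was decoded one byte at a time.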

So at some point you, or some library you're using, has done something
like read a UTF-8 byte sequence from a file and shoved it byte by byte
into a String, one character per byte. Another possible culprit is a wrong
(implicit?) encoding conversion or cast from a byte array type to a Unicode
string type.
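
A rough sketch of how that kind of mangling happens in Java, and how to avoid it. Assumptions: the sample text is hypothetical, picked because its UTF-8 bytes appear consistent with the code points quoted above if they were read as MacRoman. The sketch uses ISO-8859-1 as the wrong charset instead, because every JVM is guaranteed to ship it, so the mangled characters it prints differ from the ones in the original report. The class and method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    /** Wrong: treats each UTF-8 byte as one character of a single-byte charset. */
    static String decodeWrong(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Right: decodes the bytes with the charset they were written in. */
    static String decodeRight(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    /** Undoes the ISO-8859-1 mangling; works only because that charset maps every byte losslessly. */
    static String repair(String mangled) {
        return new String(mangled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: five Cyrillic letters, two UTF-8 bytes each.
        String original = "\u0411\u0440\u044D\u043D\u0442";
        byte[] onDisk = original.getBytes(StandardCharsets.UTF_8);

        String mangled = decodeWrong(onDisk);
        System.out.println("bytes on disk:      " + onDisk.length);
        System.out.println("mangled length:     " + mangled.length());
        System.out.println("correct length:     " + decodeRight(onDisk).length());
        System.out.println("repair round-trips: " + repair(mangled).equals(original));
    }
}
```

The point is to always name the charset explicitly when turning bytes into a String (or when constructing an InputStreamReader), rather than relying on the platform default. Once the String is correct in memory, it can be handed to the driver as-is.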

The JDBC driver is storing exactly what you tell it to, and the good ol'
GIGO (garbage in, garbage out) rule is being applied.

--
Craig Ringer
