Re: Character Encoding problem

From:	Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To:	antony baxter <antony(dot)baxter(at)gmail(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject:	Re: Character Encoding problem
Date:	2008-04-07 04:34:44
Message-ID:	47F9A464.5020501@postnewspapers.com.au
Lists:	pgsql-jdbc

antony baxter wrote:

> Displaying 'input' character by character:
> Character 0 = '8211'
> Character 1 = '235'
> Character 2 = '8212'
> Character 3 = '196'
> Character 4 = '8212'
> Character 5 = '231'
> Character 6 = '8211'
> Character 7 = '937'
> Character 8 = '8212'
> Character 9 = '199'

There's your problem. Your *input* is mangled.

The above decodes to:

–ë—Ä—ç–Ω—Ç

So at some point you, or some library you're using, has done something
like read a UTF-8 byte sequence from a file and shoved it byte by byte
into a String, one char per byte. Another possible culprit is a wrong
(implicit?) encoding conversion or cast from a byte array type to a
Unicode string type.
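
A minimal sketch of how this kind of mangling can arise, and how naming the
charset explicitly avoids it. The class name EncodingDemo and the sample
string are assumptions made for illustration; the thread does not show the
original input. The sample was picked because its UTF-8 bytes, if decoded
with a Mac Roman table, would give the code points quoted above. That is an
inference from the numbers, not something stated in the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingDemo {

    // Decode raw bytes with an explicitly named charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // The mistake: widen each byte to a char, as if one byte were one character.
    static String byteByByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Space-separated decimal code points, similar to the listing quoted above.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(cp).append(' '));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample input: five Cyrillic letters, written as escapes
        // so the source file's own encoding cannot interfere.
        String original = "\u0411\u0440\u044d\u043d\u0442";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        System.out.println("correct: " + codePoints(decode(utf8, StandardCharsets.UTF_8)));
        System.out.println("mangled: " + codePoints(byteByByte(utf8)));
    }
}
```

The byte-by-byte version produces ten characters instead of five, which is
the same symptom as in the quoted output: twice as many characters as
expected, none of them the intended ones. Decoding with the wrong
single-byte charset has the same effect, only with different wrong
characters.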

The JDBC driver is storing exactly what you tell it to, and the good ol'
GIGO (garbage in, garbage out) rule is being applied.

--
Craig Ringer

