Re: Unicode confusion

From: Ian Barwick <barwick(at)gmx(dot)net>
To: "Chris Palmer" <chris(dot)palmer(at)geneed(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Unicode confusion
Date: 2003-05-10 09:50:29
Message-ID: 200305101150.29725.barwick@gmx.net
Lists: pgsql-general

On Saturday 10 May 2003 01:47, Chris Palmer wrote:
> Hello,
(...)
> According to *The Java Programming Language, Third Edition* (p. 138),
> "...you can use the escape sequence \uxxxx to encode Unicode characters,
> where each x is a hexadecimal digit...". Therefore, shouldn't I see "262f
> 0b87" in the hex editor? It seems I'm not getting the same stuff out that I
> am putting in. psql is not much help; it just shows wacky characters (4 of
> them: "â¯à®").
>
> Am I doing something wrong? Does something need to be set in the database
> or in the JDBC Connection object? Or am I just a confused monkey?

If it's any help, your code should work as expected. The hex data you see
(3F3F0A) is two question marks and a newline (\n); I would guess Java is not
able to display the Unicode characters in your environment and is replacing
them with '?'.
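
A minimal sketch of how that substitution can happen, assuming the characters
are being encoded into a charset that has no mapping for them. US-ASCII stands
in here for whatever your platform's default encoding is, and the class and
method names are made up for illustration; I don't know which step in your
code actually does the replacement.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Hex dump of s encoded as US-ASCII. String.getBytes(Charset) replaces
    // every character the charset cannot represent with the charset's
    // replacement byte, which for US-ASCII is '?' (0x3F).
    static String asciiHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.US_ASCII)) {
            sb.append(String.format("%02X", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two characters from the original post, followed by a newline.
        System.out.println(asciiHex("\u262f\u0b87\n")); // prints 3F3F0A
    }
}
```

If that is what is going on, the strings themselves are intact and only the
encoded output is lossy.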

PostgreSQL stores Unicode internally as UTF-8, so if you view the
data with psql in a non-Unicode environment, you will probably be
seeing the individual UTF-8 byte values displayed as whatever 8-bit
characters your terminal uses.
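
As a rough illustration, assuming the terminal is using ISO-8859-1 (a guess on
my part; the class and method names are likewise invented), this shows the
UTF-8 bytes for your two characters and what they turn into when read one byte
per character:

```java
import java.nio.charset.StandardCharsets;

public class Utf8AsLatin1Demo {

    // Hex dump of the UTF-8 encoding of s.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Reinterpret the UTF-8 bytes of s as ISO-8859-1, one character per byte,
    // which is roughly what a Latin-1 terminal does with them.
    static String utf8ViewedAsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = "\u262f\u0b87";
        System.out.println(utf8Hex(s)); // prints e298afe0ae87
        // Six characters, two of which (0x98 and 0x87) are control codes
        // in ISO-8859-1 and normally invisible.
        System.out.println(utf8ViewedAsLatin1(s).length()); // prints 6
    }
}
```

Each character takes three bytes in UTF-8, so "262f 0b87" never appears as
such in the stored data. Of the six bytes, four map to printable Latin-1
characters, which would account for the four wacky characters you see.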

Ian Barwick
barwick(at)gmx(dot)net
