Re: accented characters migraine

From: John Gunther <postgresql(at)bucksvsbytes(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: accented characters migraine
Date: 2007-10-12 17:48:29
Message-ID: 470FB36D.5020205@bucksvsbytes.com
Lists: pgsql-novice

Seems to have done the trick this time. When I tried that earlier the
only difference was that accented characters displayed as gray
rectangles. It was boneheaded. Thanks.

Wright, George wrote:
> Putty is showing ISO-8859-1, which is Latin-1. I believe both client and server must be UTF-8.
>
>
>
> -----Original Message-----
> From: pgsql-novice-owner(at)postgresql(dot)org [mailto:pgsql-novice-owner(at)postgresql(dot)org] On Behalf Of John Gunther
> Sent: Friday, October 12, 2007 11:59 AM
> To: pgsql-novice(at)postgresql(dot)org
> Subject: [NOVICE] accented characters migraine
>
> It seems to me this ought to be simple and clearly documented but I've
> spent hours researching and experimenting to no avail.
>
> PROBLEM: Entering accented characters in psql often results in the
> error: invalid byte sequence for encoding "UTF8"
>
> ENVIRONMENT:
> Client OS: Windows XP
> Keyboard: United States-International
> Terminal program: putty.exe, Translation: ISO-8859-1:1998 (Latin-1, West
> Europe)
> Server OS: Ubuntu
> Server client app: psql 8.2.4
> Server db app: PostgreSQL 8.2.4
> pg settings:
> client_encoding: UTF8
> lc_collate: en_US.UTF-8
> lc_ctype: en_US.UTF-8
> server_encoding: UTF8
>
> initdb defaulted to UTF-8, which I need because I want ORDER BY to sort
> alphabetically, not by hex code.
>
> When I try to insert a string with an accented character, I generally
> get the above error. Simple example:
> template1=# \d sorttest
> id | integer
> test | text
>
> template1=# insert into sorttest (test) values ('ã');
> ERROR: invalid byte sequence for encoding "UTF8": 0xe32729
> HINT: This error can also happen if the byte sequence does not match
> the encoding expected by the server, which is controlled by
> "client_encoding".
>
> The accented character (a-tilde) is entered from the Windows keyboard
> with the ~a sequence and displays properly in psql. The problem is that
> the server rejects it.
> Observations:
> 1) The UTF-8 hex encoding of a-tilde is C3 A3 but the error message says
> the invalid sequence is E3 27 29. I don't know what the first byte means
> but the second and third are the quote and right parenthesis characters
> following the a-tilde in my insert statement.
> 2) At various times, data entry as above has started working in a
> session but I can't figure out what I did to make it happen.
> 3) I tried entering the character in hex, as I understand it: insert
> into sorttest (test) values (E'\xc3\xa3');
> This avoids the error but the string value then displays as the 2
> seemingly irrelevant characters Ã£ (A-tilde, British pound)
>
> It looks like I'm caught in some interaction between putty, psql and pg.
> The real problem is much more grave than just manual data entry-- I'm
> trying to migrate a large existing database from another pg server with:
> pg_dumpall -h nnn.nnn.nnn.nnn | psql
> This throws errors each time the COPY commands encounter an accented
> character in the dump.
>
> Any ideas? Is this just a bonehead mistake on my part?
>
> John
>
>
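
The byte values in observations 1 and 3 can be reproduced outside psql. What follows is a minimal Python sketch, assuming the terminal was sending ISO-8859-1 bytes while the server expected UTF-8. The helper names are made up for illustration; only the byte values come from the message.

```python
# Sketch only: reproduces the byte sequences quoted in the message.
# Assumption: the terminal encodes keystrokes as ISO-8859-1 (Latin-1)
# and the server interprets incoming bytes as UTF-8.

A_TILDE = "\u00e3"  # the character a-tilde, code point U+00E3


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Observation 1: a Latin-1 terminal sends a-tilde as the single byte E3.
# Followed by the quote and right parenthesis, that is E3 27 29.
# In UTF-8, E3 announces a three-byte character, but 27 and 29 are not
# continuation bytes, so the sequence is invalid.
sent_by_latin1_terminal = (A_TILDE + "')").encode("latin-1")

# What a UTF-8 terminal would send for the same character: C3 A3.
sent_by_utf8_terminal = A_TILDE.encode("utf-8")

# Observation 3: the two UTF-8 bytes C3 A3 shown by a Latin-1 terminal
# appear as two separate characters, A-tilde and the pound sign.
shown_by_latin1_terminal = sent_by_utf8_terminal.decode("latin-1")

if __name__ == "__main__":
    print(sent_by_latin1_terminal.hex())           # bytes from the error message
    print(is_valid_utf8(sent_by_latin1_terminal))  # rejected as UTF-8
    print(sent_by_utf8_terminal.hex())             # bytes the server expects
    print(shown_by_latin1_terminal)                # the two-character display
```

If that assumption holds, both symptoms have the same cause, which is consistent with the fix of switching the PuTTY translation to UTF-8 so that client and server agree.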
