Re: accented characters migraine

From: "Wright, George" <George(dot)Wright(at)infimatic(dot)com>
To: "John Gunther" <postgresql(at)bucksvsbytes(dot)com>,<pgsql-novice(at)postgresql(dot)org>
Subject: Re: accented characters migraine
Date: 2007-10-12 16:19:51
Message-ID: 51548D6D5BEB57468163194A8C1A0E98319491@MAGPTCPEXC02.na.mag-ias.net
Lists: pgsql-novice
Putty is showing ISO-8859-1, which is Latin-1. I believe both client and server must be UTF-8.



-----Original Message-----
From: pgsql-novice-owner(at)postgresql(dot)org [mailto:pgsql-novice-owner(at)postgresql(dot)org] On Behalf Of John Gunther
Sent: Friday, October 12, 2007 11:59 AM
To: pgsql-novice(at)postgresql(dot)org
Subject: [NOVICE] accented characters migraine

It seems to me this ought to be simple and clearly documented but I've
spent hours researching and experimenting to no avail.

PROBLEM: Entering accented characters in psql often results in the
error: invalid byte sequence for encoding "UTF8"

ENVIRONMENT:
Client OS: Windows XP
Keyboard: United States-International
Terminal program: putty.exe, Translation: ISO-8859-1:1998 (Latin-1, West
Europe)
Server OS: Ubuntu
Server client app: psql 8.2.4
Server db app: PostgreSQL 8.2.4
pg settings:
client_encoding: UTF8
lc_collate: en_US.UTF-8
lc_ctype: en_US.UTF-8
server_encoding UTF8

initdb defaulted to UTF-8, which I need because I want ORDER BY to sort
alphabetically, not by hex code.

When I try to insert a string with an accented character, I generally
get the above error. Simple example:
template1=# \d sorttest
id     | integer
test   | text

template1=# insert into sorttest (test) values ('ã');
ERROR:  invalid byte sequence for encoding "UTF8": 0xe32729
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".

The accented character (a-tilde) is entered from the Windows keyboard
with the ~a sequence and displays properly in psql. The problem is that
the server rejects it.
Observations:
1) The UTF-8 encoding of a-tilde is C3 A3, but the error message says
the invalid sequence is E3 27 29. I don't know what the first byte means,
but the second and third are the quote and right parenthesis characters
following the a-tilde in my insert statement.
2) At various times, data entry as above has started working in a
session but I can't figure out what I did to make it happen.
3) I tried entering the character in hex, as I understand it: insert
into sorttest (test) values (E'\xc3\xa3');
This avoids the error, but the string value then displays as the 2
seemingly irrelevant characters Ã£ (A-tilde, British pound).
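
For anyone wanting to check the byte values in observations 1 and 3, here is a minimal Python sketch. It assumes the terminal sends Latin-1 bytes while the server expects UTF-8, as the settings listed above suggest; the helper name is_valid_utf8 is made up for illustration and is not part of psql or PostgreSQL.

```python
# a-tilde is the code point U+00E3; each encoding turns it into different bytes.
a_tilde = "\u00e3"
latin1_bytes = a_tilde.encode("latin-1")  # single byte E3
utf8_bytes = a_tilde.encode("utf-8")      # two bytes C3 A3


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Assumption: a Latin-1 terminal sends E3 for the character, and the quote
# and right parenthesis from the INSERT statement follow it: E3 27 29.
sent = latin1_bytes + b"')"

# In UTF-8, E3 announces a three-byte sequence, but 27 and 29 are not
# continuation bytes, so the sequence is rejected.
print(sent.hex(), is_valid_utf8(sent))

# Observation 3 in reverse: correct UTF-8 bytes shown by a Latin-1 display
# come out as two characters, A-tilde followed by the pound sign.
mojibake = utf8_bytes.decode("latin-1")
print(mojibake)
```

If that assumption holds, both symptoms come from the same mismatch: the bytes are interpreted with a different encoding than the one that produced them.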

It looks like I'm caught in some interaction between putty, psql and pg.
The real problem is much more grave than just manual data entry-- I'm
trying to migrate a large existing database from another pg server with:
pg_dumpall -h nnn.nnn.nnn.nnn | psql
This throws errors each time the COPY commands encounter an accented
character in the dump.
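
One way to narrow this down is to check whether the dump itself contains bytes that are not valid UTF-8. The following is only a sketch under assumptions: it presumes the dump has been saved to a file first instead of being piped straight into psql, and the function name find_invalid_utf8_lines and the sample data are made up for illustration.

```python
import io


def find_invalid_utf8_lines(stream):
    """Yield (line_number, raw_bytes) for each line that is not valid UTF-8.

    `stream` is any binary file-like object, for example a dump file
    opened with mode "rb".
    """
    for number, raw in enumerate(stream, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            yield number, raw


# Made-up sample standing in for COPY data: line 2 holds a lone Latin-1
# byte E3, line 3 holds the UTF-8 encoding C3 A3 of the same character.
sample = io.BytesIO(b"1\tplain\n2\t\xe3\n3\t\xc3\xa3\n")
bad = list(find_invalid_utf8_lines(sample))
for number, raw in bad:
    print(number, raw)
```

If lines are reported, the dump is arriving in an encoding other than the UTF8 the receiving side expects, which would match the COPY errors described above.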

Any ideas? Is this just a bonehead mistake on my part?

John

