Re: about encoding

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: superman0920 <superman0920(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: about encoding
Date: 2012-04-02 15:55:24
Message-ID: 4F79CBEC.7080805@pinpointresearch.com
Lists: pgsql-admin

On 03/28/2012 08:00 PM, superman0920 wrote:
> hello
> i want to insert a report to postgresql,the report contain something
> Chinese characters and the postgresql is utf-8.
> the response from db is this:
> ERROR: invalid byte sequence for encoding "UTF8": 0xb1
> how can i fix it ?
>
<snark>Remove the bad character??</snark>

You need to provide a bit more information, but the likely cause is that
the data you are inserting is not UTF8 encoded. (0xb1 can only appear as
a continuation byte in UTF8, never as the first byte of a character.)
Check the system that is generating the data you wish to insert and make
sure it is set to write its output as UTF8.
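
One way to check the input before it reaches the database is to try
decoding it as UTF8 and report where it fails. This is only a sketch:
the function name first_invalid_utf8 and the sample bytes are made up
for illustration, and the GBK sample is an assumption about what the
generating system might be writing, not something known from the
report.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that breaks UTF-8
    decoding, or None if the data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


# Hypothetical sample: the same two Chinese characters written by a
# system that emits UTF-8 and by one that emits GBK.
utf8_bytes = "报告".encode("utf-8")
gbk_bytes = "报告".encode("gbk")

print(first_invalid_utf8(utf8_bytes))  # valid UTF-8, so None
print(first_invalid_utf8(gbk_bytes))   # offset and value of the bad byte
```

If the second call reports a byte such as 0xb1 at the start of a
character, the data is almost certainly in some other encoding rather
than corrupted.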

If you cannot set it to write UTF8, then you need to determine which
character set it is using and tell PostgreSQL to expect that. See "SET
client_encoding..." Note that UTF8 is the encoding your database uses
for storage, but you can read and write other character sets through
the client_encoding setting; the server converts between the two.
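
If nobody can tell you which character set the source uses, a rough
way to narrow it down is to try a few candidates and look at what
decodes. The sketch below assumes a hypothetical helper name,
plausible_encodings, and a guessed candidate list; the codec names are
Python's, which do not always match PostgreSQL's names for
client_encoding (for example Python's "latin-1" versus LATIN1).

```python
def plausible_encodings(data: bytes,
                        candidates=("utf-8", "gbk", "gb18030",
                                    "big5", "latin-1")):
    """Return {encoding name: decoded text} for every candidate that
    decodes `data` without raising an error."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate cannot be the right one
    return results


# Hypothetical sample bytes taken from the start of a rejected report.
sample = b"\xb1\xa8\xb8\xe6"
for name, text in plausible_encodings(sample).items():
    print(name, repr(text))
```

Decoding without an error does not prove the guess is right (latin-1
accepts any byte sequence), so someone who reads the language still
has to look at the output and pick the candidate that shows sensible
text.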

Also be aware that UTF8 is just one of many ways of encoding Unicode
code points. People often use "Unicode" and "UTF8" interchangeably in
conversation, but they are not the same. Make sure that the data you
are receiving is specifically UTF8 and not just some unknown Unicode
encoding.
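
To make the difference concrete, here is a small sketch that encodes
one code point several ways. The helper name encodings_of and the
choice of character are illustrative assumptions only.

```python
def encodings_of(text: str,
                 names=("utf-8", "utf-16-le", "utf-16-be", "utf-32-le")):
    """Return {encoding name: encoded bytes} for the given text."""
    return {name: text.encode(name) for name in names}


# One character, one Unicode code point (U+62A5), several byte layouts.
for name, raw in encodings_of("报").items():
    print(f"{name:10s}", " ".join(f"{b:02x}" for b in raw))
```

Only the utf-8 row is something a UTF8 database will accept as is. The
two utf-16 rows carry the same two bytes in opposite order, which is
where byte-order problems come from.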

If you are still stuck after checking/fixing the above, you could
actually have bad data in your input or, possibly, a byte-order
problem, though that is unlikely in most situations.

Cheers,
Steve
