Re: Unable to restore dump due to client encoding issues -- or, when is SQL_ASCII really UTF8

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Bill Moran <wmoran(at)collaborativefusion(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Unable to restore dump due to client encoding issues -- or, when is SQL_ASCII really UTF8
Date: 2007-02-27 14:30:10
Message-ID: 20070227143010.GA81580@winnie.fuhr.org
Lists: pgsql-general

On Tue, Feb 27, 2007 at 08:43:27AM -0500, Bill Moran wrote:
> First off, it's my understanding that with the SQL_ASCII "encoding",
> PostgreSQL does no checking for valid/invalid characters, per the docs:
> http://www.postgresql.org/docs/8.2/static/multibyte.html

Correct. As the documentation says, SQL_ASCII "is not so much a
declaration that a specific encoding is in use, as a declaration
of ignorance about the encoding."

> The beginning of the dump file I am restoring has the following:
> --
> -- PostgreSQL database dump
> --
>
> SET client_encoding = 'SQL_ASCII';
> [...]
>
> But when I try to pull the dump in with psql, I get the following errors:
> ERROR: invalid byte sequence for encoding "UTF8": 0xa0
> HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
>
> Connecting to the database and issuing "show client_encoding" shows that
> the database is indeed set to SQL_ASCII.

client_encoding shows the client encoding, not the database
encoding; execute "show server_encoding" to see the database
encoding. You can also use "psql -l" or "\l" from within psql to
see all databases and their encodings. The error suggests that the
database encoding is UTF8.
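
To see why a UTF8 database would reject that byte, here is a small
Python sketch (not PostgreSQL's own validation code, only an
illustration using Python's codecs; the helper name decodes_as is made
up for this example). It checks the byte from the error message, 0xa0,
against the encodings mentioned in this thread. Python's "latin-1" and
"cp1252" codecs stand in for PostgreSQL's LATIN1 and WIN1252.

```python
# Illustration only: the byte 0xa0 from the error message, checked
# against the encodings discussed in this thread.
raw = b"\xa0"


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# In UTF-8, 0xa0 is a continuation byte and cannot stand on its own.
valid_utf8 = decodes_as(raw, "utf-8")
# In LATIN1 and WIN1252, 0xa0 is a single character (NO-BREAK SPACE).
valid_latin1 = decodes_as(raw, "latin-1")
valid_cp1252 = decodes_as(raw, "cp1252")

print(valid_utf8, valid_latin1, valid_cp1252)
```

A lone 0xa0 is therefore what one would expect to find in data that
was really LATIN1 or WIN1252 but is being loaded into a UTF8 database
without conversion.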

> Now ... I'm expecting the server to accept any byte sequence, since we're
> using SQL_ASCII, but that is (obviously) not the case. Am I missing
> something obvious here? Grepping the entire dump file shows absolutely
> no references to UTF8 ... so why is the server trying to validate the
> byte string as UTF8?

Probably because the database is UTF8 (see above). Either create
the database as SQL_ASCII (see createdb's -E option) or change the
client_encoding setting in the dump to whatever the encoding really
is (probably LATIN1 or WIN1252 for Western European languages).
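
If you take the second route, the edit to the dump can be scripted. The
following is a minimal Python sketch under a few assumptions: the
function name set_dump_client_encoding is hypothetical, the sample dump
and its table "t" are invented for the test, and LATIN1 is only a guess
at the real encoding. It works on bytes so that the data in the dump is
left untouched, and it changes only the first SET client_encoding line.

```python
import re

# Matches a line of the form: SET client_encoding = '...';
# as shown at the top of the dump quoted above.
_SET_LINE = re.compile(rb"^SET client_encoding = '[^']*';", re.MULTILINE)


def set_dump_client_encoding(dump: bytes, encoding: str) -> bytes:
    """Return the dump with its first SET client_encoding line changed."""
    replacement = b"SET client_encoding = '" + encoding.encode("ascii") + b"';"
    # A callable replacement avoids any backslash processing by re.sub.
    return _SET_LINE.sub(lambda match: replacement, dump, count=1)


# Invented sample: a dump header followed by one row containing 0xa0.
sample = (
    b"--\n-- PostgreSQL database dump\n--\n\n"
    b"SET client_encoding = 'SQL_ASCII';\n"
    b"INSERT INTO t VALUES ('a\xa0b');\n"
)
fixed = set_dump_client_encoding(sample, "LATIN1")
print(fixed.decode("latin-1"))
```

With the declared client encoding corrected, a UTF8 server can convert
the incoming data instead of rejecting it, provided the guess about the
real encoding is right.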

--
Michael Fuhr
