Re: Client Encoding and Latin characters

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Lee Hachadoorian <lee(dot)hachadoorian(at)gmail(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Client Encoding and Latin characters
Date:	2009-11-24 16:45:21
Message-ID:	18502.1259081121@sss.pgh.pa.us
Lists:	pgsql-general

Lee Hachadoorian <lee(dot)hachadoorian(at)gmail(dot)com> writes:
> My database is encoded UTF8. I recently was uploading (via COPY) some
> census data which included place names with accented and other such
> characters. The upload choked on the Latin characters. Following the
> docs, I was able to fix this with:

> SET CLIENT_ENCODING TO 'LATIN1';
> COPY table FROM 'filename';

> After which I

> SET CLIENT_ENCODING TO 'UTF8';

> I typically use COPY FROM to bulk load data. My question is, is there
> any disadvantage to setting the default client_encoding as LATIN1? I
> expect to never be dealing with Asian languages, or most of the other
> LATINx languages. If I ever try to COPY FROM data incompatible with
> LATIN1, the command will just choke, and I can pick an appropriate
> encoding and try again, right?

Uh, no. You can pretty much assume that LATIN1 will take any random
byte string; likewise for any other single-byte encoding. UTF8 as a
default is a bit safer because it's significantly more likely that it
will be able to detect non-UTF8 input.

regards, tom lane
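
The point above can be illustrated outside the database. The following is a minimal sketch that uses Python's own `latin-1` and `utf-8` codecs as stand-ins for the server's encoding checks; it is not PostgreSQL's conversion code, and the helper name `decodes_as` and the sample place name are made up for illustration. It shows that a single-byte encoding accepts arbitrary bytes, so mislabeled input loads silently, whereas UTF-8 has structural rules that let it reject most non-UTF8 input.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` is accepted as valid text in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample value: the same place name stored in two encodings.
latin1_bytes = "Peñasco".encode("latin-1")  # one byte (0xF1) for the accented letter
utf8_bytes = "Peñasco".encode("utf-8")      # two bytes (0xC3 0xB1) for the accented letter

# UTF-8 notices that the LATIN1 bytes are not valid UTF-8.
print(decodes_as(latin1_bytes, "utf-8"))

# LATIN1 accepts the UTF-8 bytes without complaint and yields wrong characters.
print(decodes_as(utf8_bytes, "latin-1"))
print(utf8_bytes.decode("latin-1"))

# In Python's codec every possible byte value is a valid LATIN1 character.
print(decodes_as(bytes(range(256)), "latin-1"))
```

With LATIN1 as the default, the second case is the dangerous one: the load succeeds and the stored text is wrong, so there is no error to prompt a retry with a different encoding.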

In response to

Client Encoding and Latin characters at 2009-11-24 16:39:11 from Lee Hachadoorian

Responses

Re: Client Encoding and Latin characters at 2009-11-24 17:03:28 from Lee Hachadoorian
