Re: Converting from LATIN1 to UNICODE encoding?

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Scott Eade <seade(at)backstagetech(dot)com(dot)au>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Converting from LATIN1 to UNICODE encoding?
Date: 2005-09-21 14:15:21
Message-ID: 200509211615.22306.peter_e@gmx.net
Lists: pgsql-admin

On Wednesday, 21 September 2005 10:16, Scott Eade wrote:
> Is it necessary for me to convert the database to some other encoding
> (e.g. UNICODE) before I can store non-LATIN1 characters or does
> PostgreSQL catch these and encode them somehow?

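Yes and no: yes, you need a database with an encoding such as UNICODE to
store non-LATIN1 characters, and no, PostgreSQL does not catch and encode
them for you.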

> I have actually been attempting to convert a database by doing a pg_dump
> (from the LATIN1 database) followed by a pg_restore (to one created with
> the UNICODE encoding).

That is the right method.

> Seemed to work with a sparsely populated 8.0.3
> database, but I am running into all sorts of problems with 7.3.10 (e.g.
> corrupted database followed by corrupted pg_clog).

That is the symptom of a different, much bigger problem. Encoding problems
certainly never corrupt the clog.

> What about JDBC, how does it know the encoding of the data I throw at
> it?

That depends on where your throws come from. As you surely know, Java mostly
uses Unicode internally, so the data sent by the JDBC driver to the database
is in Unicode. As to what encoding the data that you input into your Java
program has, you need to sort that out with the Java library functions that
you use to read that data.
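
To illustrate that last point, here is a minimal sketch, assuming the input
arrives as raw LATIN1 (ISO-8859-1) bytes. The class name DecodeInput and the
helper decode() are made up for this example; only String and
StandardCharsets come from the Java standard library. Once the bytes are
decoded with the correct charset, the resulting String is Unicode and can be
handed to the JDBC driver as usual.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeInput {
    // Hypothetical helper: decode raw bytes using the encoding the input
    // is actually in. The result is an ordinary (Unicode) Java String.
    static String decode(byte[] raw, Charset sourceEncoding) {
        return new String(raw, sourceEncoding);
    }

    public static void main(String[] args) {
        // 0xE9 is "é" in LATIN1; as a lone byte it is not valid UTF-8.
        byte[] latin1Bytes = { 'c', 'a', 'f', (byte) 0xE9 };

        // Decoding with the right charset yields the intended text.
        String right = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        // Decoding with the wrong charset silently substitutes U+FFFD.
        String wrong = decode(latin1Bytes, StandardCharsets.UTF_8);

        System.out.println(right.equals("caf\u00e9")); // prints true
        System.out.println(wrong.equals("caf\u00e9")); // prints false
    }
}
```

The same idea applies when reading from a file or socket: pass the charset
explicitly (for example to an InputStreamReader) rather than relying on the
platform default.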
