Encoding problems with migration from 8.0.14 to 8.3.0 on Windows

From: Meetesh Karia <meetesh(dot)karia(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows
Date: 2008-03-12 12:25:14
Message-ID: 47D7CBAA.4090308@gmail.com
Lists: pgsql-admin pgsql-hackers

Hi all,

I'm trying to migrate from 8.0.14 on Windows (Vista Home Premium) to
8.3.0 and I've been trying to solve what appears to be an encoding
problem. My old db was in the UNICODE encoding. I know that this isn't
supported on 8.0.x on Windows, but it was a restore of a db from a Linux
environment and postgres didn't appear to have any problems with it.

My 8.3 server and client encodings are UTF8 and I used pg_dumpall (I
tried the 8.0 and 8.3 versions) to dump the db. However, when I tried
to restore the db, I got an error during index creation: it wouldn't
let me create a unique index on a column whose values are all unique
(the column had that index in 8.0, and a GROUP BY ... HAVING query run
against the table with no indexes confirms uniqueness). What this column
does have, however, is values like:

'Bruehl'
'Brühl'

I created a blank table with the unique index on it and inserted rows
one at a time until I confirmed that it was the above values that were
causing the problem. Running the following query shows the difference in
the hex-encoded values (I changed my client encoding to WIN1250 to get
the output below to display correctly):

select name, encode(decode(name, 'escape'), 'hex') from ...

     name      |           encode
---------------+----------------------------
 Daniel Brühl  | 44616e69656c204272c3bc686c
 Daniel Bruehl | 44616e69656c2042727565686c
(2 rows)
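
As a sanity check outside the database, the two hex strings can be decoded
back into text. This is only a sketch in Python, assuming the stored bytes
are UTF-8 (which the c3bc sequence for "ü" suggests); the helper name
decode_utf8_hex is made up for illustration.

```python
# Hex strings copied from the query output above.
HEX_UMLAUT = "44616e69656c204272c3bc686c"
HEX_PLAIN = "44616e69656c2042727565686c"


def decode_utf8_hex(hex_string: str) -> str:
    """Turn a hex dump back into text, assuming the bytes are UTF-8."""
    return bytes.fromhex(hex_string).decode("utf-8")


name_umlaut = decode_utf8_hex(HEX_UMLAUT)
name_plain = decode_utf8_hex(HEX_PLAIN)

# The two values differ at the byte level and as decoded text.
print(name_umlaut, "|", name_plain, "| equal:", name_umlaut == name_plain)
```

Byte for byte the two names are different strings, so whatever treats them
as duplicates during index creation is not doing a plain byte comparison.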

I've also tried exporting using an encoding of WIN1250 but I get errors
like this:

pg_dump: Error message from server: ERROR: character 0xc383 of encoding
"UNICODE" has no equivalent in "WIN1250"
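
To see which character the server is complaining about, the reported bytes
can be decoded by hand. The sketch below uses Python's cp1250 codec as a
stand-in for the server's WIN1250 conversion; that equivalence is an
assumption, and the function name describe is hypothetical.

```python
def describe(raw: bytes, target: str = "cp1250") -> str:
    """Decode one UTF-8 character and report whether `target` can hold it.

    `raw` must contain exactly one UTF-8 encoded character.
    Python's cp1250 is assumed to match the server's WIN1250 here.
    """
    text = raw.decode("utf-8")
    code_point = f"U+{ord(text):04X}"
    try:
        text.encode(target)
    except UnicodeEncodeError:
        return f"{code_point} has no equivalent in {target}"
    return f"{code_point} is representable in {target}"


# Bytes from the pg_dump error message.
print(describe(bytes.fromhex("c383")))
# Bytes for the "ü" seen in the hex output of the name column.
print(describe(bytes.fromhex("c3bc")))
```

This only identifies the character behind 0xc383 (U+00C3, "Ã"); it does not
explain how that character ended up in the data.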

Anyone have any thoughts or suggestions? Why would the index creation
fail? Is there a workaround?

Thanks,
Meetesh
