Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows

From: Meetesh Karia <meetesh(dot)karia(at)gmail(dot)com>
To: meetesh(dot)karia(at)alumni(dot)duke(dot)edu
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows
Date: 2008-03-12 13:37:16
Message-ID: 47D7DC8C.7000204@gmail.com
Lists: pgsql-admin pgsql-hackers

One quick addition to this:

The column I'm creating this unique index on is a varchar(255) and the
command I was running was:

create unique index foo_name on foo (name);

If I use the following, it now works:

create unique index foo_name on foo (cast(name as bytea));

Thoughts?

Meetesh

Meetesh Karia wrote:
> Hi all,
>
> I'm trying to migrate from 8.0.14 on Windows (Vista Home Premium) to
> 8.3.0 and I've been trying to solve what appears to be an encoding
> problem. My old db was in the UNICODE encoding. I know that this
> encoding isn't supported on Windows in 8.0.x, but it was a restore of a
> db from a Linux environment and postgres didn't appear to have any
> problems with it.
>
> My 8.3 server and client encodings are UTF8 and I used pg_dumpall (I
> tried the 8.0 and 8.3 versions) to dump the db. However, when I tried
> to restore the db, I got an error during index creation which wouldn't
> let me create a unique index on a column that had all unique values
> (it had the index in 8.0, and a GROUP BY ... HAVING query with no
> indexes on the table confirms uniqueness). However, the column does
> contain values like:
>
> 'Bruehl'
> 'Brühl'
>
> I created a blank table with the unique index on it and inserted rows
> one at a time until I confirmed that it was the above values that were
> causing a problem. Running the following query shows the difference
> in the hex encoded values (I changed my client encoding to WIN1250 to
> get the below to show up correctly):
>
> select name, encode(decode(name, 'escape'), 'hex') from ...
>
>      name      |           encode
> ---------------+----------------------------
>  Daniel Brühl  | 44616e69656c204272c3bc686c
>  Daniel Bruehl | 44616e69656c2042727565686c
> (2 rows)
>
> I've also tried exporting using an encoding of WIN1250 but I get
> errors like this:
>
> pg_dump: Error message from server: ERROR: character 0xc383 of
> encoding "UNICODE" has no equivalent in "WIN1250"
>
> Anyone have any thoughts or suggestions? Why would the index creation
> fail? Is there a workaround?
>
> Thanks,
> Meetesh
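
For anyone who wants to check the bytes outside the database, here is a minimal Python sketch. It only uses the hex values and the 0xc383 sequence quoted above. The folded_key helper is hypothetical: it assumes, purely for illustration, a comparison rule that treats "ü" like "ue". It is not a claim about what PostgreSQL or the Windows locale actually does.

```python
# Hex values copied from the query output quoted above.
HEX_VALUES = [
    "44616e69656c204272c3bc686c",   # shown as "Daniel Brühl"
    "44616e69656c2042727565686c",   # shown as "Daniel Bruehl"
]


def decode_utf8(hex_value: str) -> str:
    """Turn a hex dump into text; raises UnicodeDecodeError if not valid UTF-8."""
    return bytes.fromhex(hex_value).decode("utf-8")


def folded_key(text: str) -> str:
    """Hypothetical comparison key that treats U+00FC (u-umlaut) like 'ue'.

    Assumption for illustration only, not the server's real collation rule.
    """
    return text.replace("\u00fc", "ue")


raw = [bytes.fromhex(h) for h in HEX_VALUES]
names = [decode_utf8(h) for h in HEX_VALUES]

# A byte-wise comparison (what an index on a bytea cast would use) sees two values.
bytes_differ = raw[0] != raw[1]

# Under the assumed folding rule the two names become indistinguishable.
keys_collide = folded_key(names[0]) == folded_key(names[1])

# The sequence from the pg_dump error, decoded as UTF-8, then tried in WIN1250.
error_char = bytes.fromhex("c383").decode("utf-8")
try:
    error_char.encode("cp1250")
    fits_win1250 = True
except UnicodeEncodeError:
    fits_win1250 = False

print(names)
print("bytes differ:", bytes_differ)
print("folded keys collide:", keys_collide)
print("error character: U+%04X, fits WIN1250: %s" % (ord(error_char), fits_win1250))
```

Both stored values are valid, distinct UTF-8, so a uniqueness violation would have to come from the comparison rule rather than from the bytes. That would fit with the bytea cast working. The 0xc383 sequence decodes to U+00C3, which has no WIN1250 equivalent, matching the pg_dump error.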
