BUG #1076: Unicode Errors using Copy command

From: "PostgreSQL Bugs List" <pgsql-bugs(at)postgresql(dot)org>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #1076: Unicode Errors using Copy command
Date: 2004-02-10 03:27:46
Message-ID: 20040210032746.7D014CF486E@www.postgresql.com
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 1076
Logged by: mike

Email address: michael_godshall(at)gmachs(dot)com

PostgreSQL version: 7.4

Operating system: Windows/Cygwin

Description: Unicode Errors using Copy command

Details:

Hello,

I have a database that I upgraded from 7.3 to 7.4.1. When I restored the
backups, I received Unicode error messages while the script was restoring a
few tables. Those tables were created successfully but had no data in them.

I dropped the database with the errors, re-created it using sql-ascii as
the encoding, and re-issued the restore command. Everything was restored
successfully.

Next, in psql, I did the following:
1) set client_encoding = 'unicode';
2) Create Table unicode.Foo( ... );
   (I copied the SQL statement that creates one of the tables that failed to
   import when the default encoding was unicode, changing only the table
   name.)
3) Insert into unicode.Foo
   Select * from sql_ascii.Foo;

The statements executed without error and the data from my sql_ascii encoded
table was successfully copied into the new unicode table. I did a select *
from unicode.foo and can see the non-english punctuation in the table now.

Thus there seems to be a problem with converting sql-ascii to unicode within
the Copy command. I found a few postings in pgsql-bugs questioning whether
or not this was a problem in 7.4, but no confirmation and no word on whether
anyone is currently working on it.

Examples of error messages I received when issuing the Copy command are the
following:
1)
ERROR: invalid byte sequence for encoding "UNICODE": 0XE56C73
CONTEXT: COPY volume_reports_copy_of_public_table, line 18808, column
transfereename: "Vralstad"
(Please note: I do not know how to reproduce the small "o" that is supposed
to appear above the first letter "a" in this name.)

2)
ERROR: Unicode characters greater than or equal to 0x10000 are not supported
CONTEXT: COPY merged_results, line 1150, column how_make_better: " ...Konig
was..."
(Again, I do not know how to reproduce the two small dots that should appear
above the letter "o" in that name/word.)
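
One possible reading of the two messages above, sketched in Python rather than taken from the server code: if the accented letters were stored as single Latin-1 bytes (an assumption; the sample byte strings below are hypothetical reconstructions of the two names), then a UTF-8 decoder would see each of them as the lead byte of a multi-byte sequence. 0xE5 announces a 3-byte sequence, which matches the 0XE56C73 in the first error (0xE5 followed by "l" and "s"), and 0xF6 announces a 4-byte sequence, which can only encode code points of 0x10000 and above.

```python
# Hypothetical sample values: the two names from the error reports, assuming
# the accented letters were stored as single Latin-1 (ISO 8859-1) bytes.
SAMPLES = [b"Vr\xe5lstad", b"K\xf6nig"]


def utf8_lead_byte_length(byte):
    """Return the sequence length that a UTF-8 lead byte announces."""
    if byte < 0x80:
        return 1  # 0xxxxxxx: plain ASCII
    if 0xC0 <= byte < 0xE0:
        return 2  # 110xxxxx
    if 0xE0 <= byte < 0xF0:
        return 3  # 1110xxxx: code points below 0x10000
    if 0xF0 <= byte < 0xF8:
        return 4  # 11110xxx: code points 0x10000 and above
    return 0      # continuation byte or not a valid lead byte


def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for raw in SAMPLES:
    # First byte outside the ASCII range in each sample.
    high = next(b for b in raw if b >= 0x80)
    print(
        raw,
        "| valid UTF-8:", is_valid_utf8(raw),
        "| lead byte 0x%02X announces length" % high,
        utf8_lead_byte_length(high),
        "| as Latin-1:", ascii(raw.decode("latin-1")),
    )
```

Neither sample is valid UTF-8, but both decode cleanly as Latin-1, which would be consistent with data that was written in a single-byte encoding and is now being checked as Unicode.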

Version: PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1
(cygming special).
OS - Windows 2000 SP3.

I would like to make the default encoding for this database Unicode. Would
it be best to do what I did above for every table in the database, drop the
original tables, rename the new versions to the original names, back up the
database, and then restore the backup as a new database with the default
Unicode encoding?
Any other suggestions?
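
One alternative that might be worth trying, sketched here under assumptions rather than offered as a confirmed fix: if the stored bytes really are Latin-1 (not verified in this report), a plain-text dump could be re-encoded to UTF-8 before it is restored into a Unicode database. The function names and the `source_encoding` default below are hypothetical, and the approach only applies to plain-text dumps, not to binary archive formats.

```python
def reencode_dump(raw, source_encoding="latin-1"):
    """Re-encode dump bytes to UTF-8.

    source_encoding is an assumption about how the data was originally
    written; a wrong guess yields valid UTF-8 with the wrong characters.
    """
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src_path, dst_path, source_encoding="latin-1"):
    """Read a plain-text dump, re-encode it, and write the result."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(reencode_dump(data, source_encoding))


if __name__ == "__main__":
    # Hypothetical sample line; 0xE5 is the Latin-1 byte for a-with-ring.
    sample = b"18808\tVr\xe5lstad\n"
    print(reencode_dump(sample))
```

The converted file would then be restored into a database created with the Unicode encoding. Spot-checking a few rows with accented characters after the restore would show whether the Latin-1 guess was right.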

Mike
