From: | "Brandon Aiken" <BAiken(at)winemantech(dot)com> |
---|---|
To: | <thewild(at)free(dot)fr>, <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: MSSQL to PostgreSQL : Encoding problem |
Date: | 2006-11-22 18:55:55 |
Message-ID: | F8E84F0F56445B4CB39E019EF67DACBA3C4BBD@exchsrvr.winemantech.com |
Lists: | pgsql-general |
It might also be a big-endian/little-endian problem, although I always thought that was platform-specific, not locale-specific.
Try the UCS-2-INTERNAL and UCS-4-INTERNAL encodings in iconv, which should use the two-byte and four-byte forms of UCS in the system's native byte order.
There are many Unicode encoding names that iconv supports:
UTF-8
ISO-10646-UCS-2 UCS-2 CSUNICODE
UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
UCS-2LE UNICODELITTLE
ISO-10646-UCS-4 UCS-4 CSUCS4
UCS-4BE
UCS-4LE
UTF-16
UTF-16BE
UTF-16LE
UTF-32
UTF-32BE
UTF-32LE
UNICODE-1-1-UTF-7 UTF-7 CSUNICODE11UTF7
UCS-2-INTERNAL
UCS-2-SWAPPED
UCS-4-INTERNAL
UCS-4-SWAPPED
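To make the byte-order point concrete, here is a minimal Python sketch (not from the thread, and not how iconv works internally) that guesses the byte order of two-byte Unicode data and recodes it to UTF-8. The names `sniff_utf16` and `to_utf8` are made up for illustration. It assumes mostly Latin text, so the high byte of each 16-bit unit is 0x00, and it falls back to little-endian on a tie, which is only a guess based on the file coming from Windows.

```python
import codecs


def sniff_utf16(data: bytes) -> str:
    """Guess a Python codec name for two-byte Unicode data (hypothetical helper)."""
    # A byte-order mark settles it; the plain "utf-16" codec reads and strips it.
    # Caveat: the UTF-32LE BOM begins with the same two bytes; not handled here.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: for mostly-Latin text the high byte of each unit is 0x00.
    # Little-endian puts that zero second (odd offsets), big-endian first.
    even_nuls = data[0::2].count(0)
    odd_nuls = data[1::2].count(0)
    # Assumption: prefer little-endian on a tie.
    return "utf-16-le" if odd_nuls >= even_nuls else "utf-16-be"


def to_utf8(data: bytes) -> bytes:
    """Recode two-byte Unicode data to UTF-8 using the guessed byte order."""
    return data.decode(sniff_utf16(data)).encode("utf-8")
```

If the file really is two-byte Unicode, that would be one possible explanation for the 0x00 bytes in the COPY error quoted below: every plain Latin character carries a zero high byte.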
Gee, didn't Unicode just so simplify this codepage mess? Remember when it was just ASCII, EBCDIC, ANSI, and localized codepages?
--
Brandon Aiken
CS/IT Systems Engineer
-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Arnaud Lesauvage
Sent: Wednesday, November 22, 2006 12:38 PM
To: Arnaud Lesauvage; General
Subject: Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem
Alvaro Herrera wrote:
> Arnaud Lesauvage wrote:
>> Alvaro Herrera wrote:
>> >Arnaud Lesauvage wrote:
>> >
>> >>mydb=# SET client_encoding TO LATIN9;
>> >>SET
>> >>mydb=# COPY statistiques.detailrecherche (log_gid,
>> >>champrecherche, valeurrecherche) FROM
>> >>'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV;
>> >>ERROR: invalid byte sequence for encoding "LATIN9": 0x00
>> >>HINT: This error can also happen if the byte sequence does
>> >>not match the encoding expected by the server, which is
>> >>controlled by "client_encoding".
>> >
>> >Huh, why do you have a "0x00" byte in there? That's certainly not
>> >Latin9 (nor UTF8 as far as I know).
>> >
>> >Is the file actually Latin-something or did you convert it to something
>> >else at some point?
>>
>> This is the file generated by DTS with "ANSI" encoding. It
>> was not altered in any way after that !
>> The doc states that ANSI exports with the local codepage
>> (which is Win1252). That's all I know. :(
>
> I thought Win1252 was supposed to be almost the same as Latin1. While
> I'd expect certain differences, I wouldn't expect it to use 0x00 as
> data!
>
> Maybe you could have DTS export Unicode, which would presumably be
> UTF-16, then recode that to something else (possibly UTF-8) with GNU
> iconv.
UTF-16! That's something I haven't tried!
I'll try an iconv conversion from UTF-16 to UTF-8 tomorrow!
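
For reference, a rough Python equivalent of that iconv conversion might look like the sketch below. It is not from the thread; `recode_file` is a hypothetical helper name, and the defaults assume the export is UTF-16 with a byte-order mark. Without a BOM, Python's "utf-16" codec assumes the machine's native byte order, so an explicit "utf-16-le" or "utf-16-be" may be needed.

```python
import shutil


def recode_file(src, dst, src_encoding="utf-16", dst_encoding="utf-8"):
    """Stream a text file from one encoding to another (hypothetical helper)."""
    # newline="" on both sides keeps the original line endings untouched,
    # which matters for CSV fields that contain embedded line breaks.
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
         open(dst, "w", encoding=dst_encoding, newline="") as fout:
        shutil.copyfileobj(fin, fout)
```

After recoding, the COPY would presumably be run with client_encoding set to UTF8 rather than LATIN9.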
--
Arnaud